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Abstract 

A  new  approach  to  separating  convolved  signals,  referred  to  as  homomorphic  decon¬ 
volution,  is  presented.  The  class  of  systems  considered  in  this  report  is  a  member  of 
a  larger  class  called  homomorphic  systems,  which  are  characterized  by  a  generalized 
principle  of  superposition  that  is  analogous  to  the  principle  of  superposition  for  linear 
systems. 

A  detailed  analysis  based  on  the  z-transform  is  given  for  discrete-time  systems  of 
this  class.  The  realization  of  such  systems  using  a  digital  computer  is  also  discussed 
in  detail.  Such  computational  realizations  are  made  possible  through  the  application  of 
high-speed  Fourier  analysis  techniques. 

As  a  particular  example,  the  method  is  applied  to  the  separation  of  the  compo¬ 
nents  of  a  convolution  in  which  one  of  the  components  is  an  impulse  train.  This  class 
of  signals  is  representative  of  many  interesting  signal-analysis  and  signal-processing 
problems  such  as  speech  analysis  and  echo  removal  and  detection.  It  is  shown  that 
homomorphic  deconvolution  is  a  useful  approach  to  either  removal  or  detection  of 
echoes. 
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1.  INTRODUCTION 


In  many  physical  situations,  we  encounter  signals  or  waveforms  that  may  be  repre¬ 
sented  as  the  convolution  of  two  or  more  components.  One  cl  ss  of  these  problems 
arises  when  a  signal  is  distorted  by  transmission  through  a  linear  system.  For  example, 
the  effects  of  multipath  and  reverberation  may  be  modeled  in  terms  of  a  signal  that  is 
passed  through  a  linear  system  whose  impulse  response  is  an  impulse  train.  In  this  case 
we  may  be  interested  either  in  recovering  the  undistorted  signal  or  in  determining  the 
parameters  of  the  impulse  response.  A  similar  class  of  problems  arises  when  we  are 
given  a  waveform  that  can  be  represented  as  a  convolution  of  two  or  more  component 
signals,  and  we  may  wish  to  determine  these  components  so  as  to  characterize  the  wave¬ 
form  or  the  physical  process  from  which  it  originated.  1-  example,  certain  segments 
of  speech  waveforms  may  be  represented  as  the  convolution  of  several  components. 
Most  speech  bandwidtn-compression  schemes  are  based  on  the  determination  of  the 
parameters  of  these  component  waveforms. 

The  process  of  separating  the  components  of  a  convolution  is  termed  deconvolution. 

In  performing  deconvolution  of  a  waveform  we  must  determine  an  appropriate  transfor¬ 
mation  of  the  waveform  into  the  desired  component  waveform.  A  common  method  of 
deconvolution  is  called  inverse  filtering.  Iu  this  method,  the  signal  is  transformed  by 
a  linear  time -invariant  system  whose  system  function  is  the  reciprocal  of  the  Fourier 

transform  of  the  components  to  be  removed.  Although  inverse  filtering  has  been  sue- 
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cessfully  applied  in  processing  many  different  types  of  signals,  ’  it  is  limited  by  the 
necessity  of  knowing  the  signal  to  be  removed,  as  well  as  having  a  sensitivity  to  additive 
noise.  Another  deconvolution  technique  is  based  on  the  Wiener  theory  of  linear  filtering. 
This  technique  has  been  extensively  applied  in  processing  seismic  waveforms.^  In  detec- 

g 

tion  of  echoes,  maximum-likelihood  methods  and  correlation  have  been  used.  Variour 
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other  techniques  have  been  developed  for  special  situations.  *  It  is  difficult  to  compare 
the  various  methods  of  deconvolution  because  generally  each  method  requires  different 
information  about  the  signals  and  the  objectives  of  each  method  are  not  precisely  the 
same.  Nevertheless,  it  is  clear  that  there  is  not  a  single  best  method  that  can  be  applied 
to  all  deconvolution  problems.  Given  the  importance  of  the  problem  of  deconvolution, 
it  seems  that  even  though  a  variety  of  methods  are  available,  at  present,  it  is  cogent  to 
investigate  other  approaches.  The  detailed  consideration  of  a  new  approach  to  deconvo¬ 
lution  is  therefore  the  subject  oi  this  report. 

The  approach  to  deconvolution  presented  here  was  originally  proposed  by  Professor 
Alan  V.  Oppenheim  as  an  application  of  the  theory  of  generalized  superposition.  ’  The 
parallel  development  of  the  applications  of  this  technique  to  speech  analysis1**'  20  by 

Q 

Oppenheim,  and  to  echo  removal  by  the  author  led  to  the  theoretical  formulation  of  the 
technique  presented  in  this  report. 

Our  purpose  is  to  give  a  detailed  discussion  of  the  characteristics  of  this  new 
approach  to  separating  convolved  signals.  Since  it  appears  that  digital  realizations  of 
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this  signal-processing  method  are  most  promising,  ovr  analysis  will  be  confined  to 
discrete-time  signals  and  will  be  based  on  the  z-transform.  We  shall  also  investigate 
carefully  the  actual  realization  of  technique  in  the  form  of  algorithms  for  a  digital  com¬ 
puter.  As  an  example  cf  the  use  of  this  technique,  we  have  considered  the  problem  of 
deconvolution  for  the  class  of  signals  that  are  represented  as  the  convolution  of  one  or 
more  waveforms  with  an  impulse  train.  This  kind  of  representation  is  characteristic 
of  the  waveforms  of  speech  and  music  and  many  other  acoustic  disturbances.  Also, 
seismic  signals,  sonar  signals,  and  many  biological  signals  are  in  this  class.  In  fact, 
any  signal  that  is  quasi-periodic  by  nature,  or  any  signal  that  has  been  transmitted 
through  a  reverberant  environment  will  have  such  a  representation. 

We  shall  now  review  the  theory  of  generalized  superposition,  its  relation  to 
"cepstral"  analysis,*0-1^  and  its  application  to  deconvolution.  In  Section  II  a  detailed 
analysis  of  the  technique  will  be  presented,  and  in  Section  III  we  shall  focus  on  compu¬ 
tational  considerations.  In  the  rest  of  the  report  we  shall  discuss  applications  to  speech 
processing  and  to  echo  removal  and  detection. 


1.  1  GENERALIZED  SUPERPOSITION 

A  system  is  often  defined  abstractly  as  a  unique  transformation  of  an  input  signal 
or  waveform  x  into  an  output  signal  y.  The  signals  are  represented  by  functions  of 
time,  and  the  system  corresponds  to  the  mathematical  concept  of  an  operator.  Such 
transformations  are  denoted  by 


y  =  T[x], 

In  order  to  characterize  and  classify  systems,  we  place  restrictions  on  the  form 
of  the  operator  T[  ],  For  example,  the  class  of  linear  systems  is  characterized  by  the 
property 

T[eJtj+bx2]  =  aT[xJ  +  bT(x2].  (1) 

Similarly,  the  class  of  time-invariant  systems  is  characterized  by  the  property  that  if 


T[x(t)]  =  y(t), 

then 

T[x(t+to)]  =  y(t+tQ),  (2) 


The  class  of  linear  time -invariant  (LTI)  systems  has  both  of  the  properties 

expressed  by  Eqa.  1  and  2.  As  a  direct  consequence  of  these  properties,  it  can  be 
27 

shown  that  all  LTI  systems  are  described  by  the  convolution  integral 


y(l)  =  f  x(T)  h{t-T)  dT  s  f  h(T)  x(t-T)  dT, 
J-00  ^-30 


(3) 


where  y(t)  is  the  output,  x(t)  is  the  input,  and  h(t>  is  the  response  of  the  system 


2 


to  a  unit  impulse.  The  class  of  LTI  systems  is  very  important  for  three  basic  reasons. 

1.  Linear  time- invariant  systems  are  rather  easy  to  analyze  and  characterize. 

2.  It  is  possible  to  design  linear  systems  to  perform  a  large  variety  of  useful 
functions. 

3.  Many  naturally  occurring  phenomena  are  accurately  modeled  using  linear  sys¬ 
tem  theory. 

The  first  of  these  comments  is  primarily  a  consequence  of  the  principle  of  super¬ 
position  (Eq.  1)  which  character,  lzes  linear  systems.  In  particular,  when  ihe  input 
is  a  sum  of  component  signals,  a  linear  system  is  very  convenient  for  separating 
one  component  from  the  other.  As  we  shall  see,  our  approach  to  deconvolution  is 
motivated  by  similar  considerations. 

Classes  of  systems  are  defined  by  placing  restrictions  on  the  transformation  that 
represents  the  system.  To  state  that  a  system  is  nonlinear  does  nothing  to  charade  ize 
the  properties  of  that  system.  An  approach  to  characterizing  nonlinear  systems  which 
is  based  on  linear  algebra  has  been  presented  bv  Oppenheim. 1  In  this  approach  it  is 
recognized  that  vector  spaces  of  time  functions  at  the  input  and  output  of  a  system 
can  be  constructed  with  a  variety  of  defir  ltd  ij  of  vector  addition  and  scalar  multipli¬ 
cation.  Thu3  many  nonlinear  systems  can  'c  r  ip.c-oented  as  linear  transformations 
between  vector  spaces  and  can  thus  be  said  to  >!  oy  a  generalized  principle  of  super¬ 
position.  Nonlinear  systems  of  this  type  have  been  called  homomorphic  systems  to 
emphasize  the  fact  that  they  are  represe- •(,<  c!  by  algebraically  linear  transformations. 

If  we  take  the  operations  of  vector  addition  to  be  the  same  in  the  input  and  output  spaces, 
then  a  generalization  of  the  linear  filtering  problem  follows.  This  approach  applied  to 
the  separation  of  convolved  signals  is  appropriately  termed  homomorphic  deconvolution. 


0 


y  =  H  tx] 

=  H Cx ,3  ®H[x2] 

Fig.  1.  Representation  of  a  homomorphic  system  thal  obeys  a 
generalized  principle  of  superposition  for  convolution. 


0 

H 

X  =x1  ®  x2 

Tile  class  of  homomorphic  systems  of  interest  for  deconvolution  is  one  in  which 
vector  addition  i?  defined  as  convolution.  A  system  of  this  class  is  shown  in  Fig.  1. 
The  system  H  is  characterized  by  the  fact  that  if 

Hfxi] =  yi  and  hIx2]  =  y2> 

tS:  en 
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(4) 


Hpa)Xj  $  (b)x2]  =  (a)H[xj]  ®  (b)H{x2)  =  (a)yj  ®  (b)y2> 

where  ®  denotes  convolution,  and  ^  denotes  scalar  multiplication.  (The  meaning  of 
scalar  multiplication  is  discuused  in  the  Appendix.)  Comparison  of  Eqs.  1  and  4  should 
suffice  to  show  why  we  use  the  term  "generalized  superposition."  it  has  been  shown* 
that  all  homomorphic  systems  have  a  canonic  representation  as  the  cascade  of  a  non¬ 
linear  system  followed  by  a  lines r  system  and  then  another  nonlinear  system.  For  con¬ 
volutional  input  and  output  spaces,  this  canonic  form  is  shown  in  Fig.  2.  The  system  D 
is  a  homomorphic  transformation  from  a  convolutional  space  to  an  additive  space  so  that 
if  D[xJ  =  Xj  and  D[x2]  =  x2,  then 

Dp^Xj  ®  \]  =  aD[xJ  bD[x2]  =  axj  +  bx2> 

The  system  L  is  a  linear  system  in  the  conventional  sense  so  that  if 
L[xJ  =  yj  and  r.,[x2]  =  y,, 

then 

L[axj+bx2]  =  aL[Sj]  +  bl,[x2]  =  apj  +  by2. 

The  system  D~*  is  the  inverse  of  the  system  D  and  it  serves  to  transform  from  the 
additive  space  of  L  back  to  the  convolutional  space. 


H 

Fig.  2.  Canonic  form  for  homomorphic  deconvolution. 

The  canonic  representation  is  extremely  important.  All  homomorphic  systems  with 
convolution  for  both  input  and  output  operations  have  the  same  form  and  differ  only  in 
the  linear  part,  L.  This  is  the  reason  for  referring  to  Fig.  2  as  a  canonic  representa¬ 
tion.  It  should  be  clear  that  such  a  representation  allows  us  to  study  such  systems  by 
first  focusing  our  attention  on  the  system  D,  and  then  applying  the  well-developed  tech¬ 
niques  of  linear  system  theory  to  aid  in  understanding  a  particular  over-all  system  H. 
For  example,  if  we  are  interested  in  designing  a  homomorphic  system  for  recovering 
signal  Xj  from  the  convolution  x  -  Xj  ®  x,,  we  need  to  choose  the  system  L  so  that  x2 
is  removed  from  the  additive  combination  existing  at  the  output  of  D. 

The  system  D  depends  entirely  on  the  specific  operation  for  combining  signals  at 


4 


the  input  and  thus  is  the  same  for  all  homomorphic  systems  for  deconvolution.  For 
this  reason,  the  system  D  is  called  the  characteristic  system  'or  homomorphic 
deconvolution. 

The  nature  of  the  transformation  D  is  suggested  by  considering  the  Fourier  trans¬ 
forms  of  x  and  x.  Suppose 

x  =  Xj  ®  x2> 

so  that  the  Fourier  transform  of  x  is 

X  =  Xj  •  X2,  (5) 

where  X,  Xj,  and  X2  are  the  Fourier  transforms  of  x,  Xj,  and  x2>  We  see  also  that 
the  Fourier  transform  of  x  must  be  of  the  form 

A  A  A 

X  =  Xj  +  X2.  (6) 

Equations  5  and  6  suggest  that  under  an  appropriate  definition  of  the  logarithm,  we  might 
define  the  system  D  to  be  the  system  whose  output  Fourier  transform  is  the  complex 
logarithm  of  the  transform  of  the  input;  that  is, 

X  =  log  [X]. 

Furthermore,  this  suggests  the  method  of  realizing  the  transformation  D  shown  in 
Fig.  3. 

Thus  homomorphic  deconvolution  is  based  on  transforming  a  convolution  into  a  sum 
and  then  using  a  linear  system  to  separate  the  additive  components.  The  result  is  then 
transformed  back  to  the  original  input  space. 


I _ ! 

D 


Fig.  3.  Formal  realization  of  the  characteristic  system 
for  homomorphic  deconvolution. 


We  have  chosen  for  investigation,  as  examples  of  the  application  of  homomorphic 
deconvolution,  the  class  of  signals  that  can  be  repre  sented  as  a  convolution  in  which 
one  of  the  components  is  an  impulse  train.  As  an  example  of  this  class  consider 

x(t)  =  s(t)  +  as(t-tQ)  =  (uo(t)+auo(t-to)]  $  s(t). 

The  Fourier  transform  of  this  equation  is 
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=  S(«)[ 


1  +  a  e 


The  complex  logarithm  is  formally 


X»)  =  log  [S<«)]  +  log 


We  note  that  the  second  term  in  this  expression  is  periodic  in  w  with  a  repetition  rate 
proportional  to  tQ. 

Suppose  we  view  log  [X(o)]  as  a  waveform  to  be  filtered  with  a  linear  system.  We 

(  'iuto\ 

note  that  if  the  spectra  of  log  [S(w)]  and  log  yl  +  e  /do  not  overlap,  the  separation 
of  the  two  components  is  relatively  easy.  Alternatively,  we  require  that  the  term 
(  'juto\ 

log  \l  +  ae  )  vary  rapidly,  compared  with  the  variations  in  log  [S(u)].  Thus  we  see 
that  the  transformation  D  allows  us  to  transform  a  convolution  of  waveforms  into  a  sum 
that,  under  appropriate  conditions,  can  be  separated  by  a  linear  system.  This  allows  one 
who  is  familiar  with  linear  system  theory  to  apply  all  of  his  experience  and  intuition  to 
this  technique  of  deconvolution  simply  by  focusing  his  attention  on  the  log  of  the  Fourier 
transform  and  interchanging  the  roles  of  time  and  frequency. 

1.  2  THE  CEPSTRUM 


Independently  of  Oppenheim's  formulation  of  the  theory  of  homomorphic  systems  and 
our  subsequent  work,  Bogert,  Healy,  and  Tukey10  recognized  that  the  logarithm  of  the 
power  spectrum  {the  Fourier  transform  of  the  autocorrelation  function)  for  a  signal  con¬ 
taining  an  echo  should  have  a  periodic  component  whose  repetition  rate  is  related  to  the 
echo  delay.  Thus  the  power  spectrum  of  the  logarithm  of  the  power  spectrum  should 

exhibit  a  peak  at  the  echo  delay  time.  This  function  was  called  the  "cepstrum"  by  trans- 
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posing  some  letters  of  the  word  "spectrum. "  Noll  has  traced  the  evolution  of  cepstrai 
analysis  and  also  discussed  various  definitions  of  the  cepstrum  which  have  been 
employed.  Although  cepstrai  methods  have  been  developed  from  an  empirical  point  of 
view,  we  can  see  that  the  cepstrur  is  clearly  related  to  homomorphic  deconvolution.  The 
basic  difference  is  that  we  shall  employ  a  Fourier  transform  (magnitude  and  phase), 
rather  than  the  power  or  the  energy  spectrum.  We  do  this  because  we  are  concerned 
with  the  more  general  problem  of  recovery  of  signals  as  opposed  to  detection  of  echoes. 
To  emphasize  this  distinction,  we  shall  refer  to  the  output  of  the  characteristic  sys¬ 
tem  D  as  the  complex  cepstrum. 
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II.  ANALYSIS  OF  DISCRETE-TIME  HOMOMORPHIC  DECONVOLUTION 


We  have  introduced  the  concept  of  systems  that  obey  a  generalized  principle  of  super¬ 
position  in  which  addition  is  replaced  by  convolution.  Since  it  appears,  at  present,  that 
such  systems  can  be  most  easily  realized  digitally,  we  shall  be  concerned  henceforth  only 
with  discrete -time  systems  of  this  class.  Thus  our  signal  vectors  are  sequences  of 
numbers,  and  convolution  is  defined  as 


(7) 


The  canonic  form  for  discrete -time  homomorphic  systems  is  shown  in  Fig.  4,  where 
x  is  the  input  sequence,  and  x  is  the  complex  cepstrum.  The  system  D  characterizes 
all  systems  of  this  class.  Therefore  we  shall  begin  our  study  of  such  systems  with  a 
study  of  the  system  D,  and  then  consider  the  choice  of  the  linear  system  L. 


Fig.  4.  Canonic  form  for  discrete-time  homomorphic  deconvolution. 


The  properties  of  the  transformation  D  can  best  be  analyzed  by  considering  the  z- 
transforms  of  x  and  x.2^’2*1  If  x  is  a  convolution, 

x  =  Xj  ®  x2, 

then 

X(z)  =  Xj(z)  •  X2(z).  (8) 

(Note  that  ®  denotes  discrete -time  convolution  as  in  Eq.  7.)  We  require  that  if  x  is  a 
convolution  as  in  Eq.  7,  then 

/\  A  A 

x=  Xj  +x2. 

Thus  the  z -transform  of  x  must  be  of  the  form 

X(z)  =  X,(z)  +  X2(z).  (9) 

If  we  compare  (Q)  and  (9),  we  see  that  the  requirement  is  that  the  system  D  effectively 


transform  a  product  of  z -transforms  into  a  sum  of  corresponding  z -transforms.  We 
shall  show  that,  under  appropriate  definition  of  the  complex  logarithm, 

log  [Xj(z)X2(z)]  =  log  [Xj(z)]  +  log  [X2(z)l 

Thus  we  are  led  to  define  the  system  D  as  one  for  which  the  z -transform  of  the  output 
is  the  complex  logarithm  of  the  z -transform  of  the  input.  That  is, 

00 

X(z)  =  £  x(n)  z~n  =  log  [XU)].  (10) 

n=-oo 

Since  log  [X(z)]  must  be  a  z-transform,  it  must  have  the  properties  of  a  z-transform. 
In  particular,  we  must  be  able  to  define  a  region  (actually  a  Riemann  surface)  in  which 
log  [X(z)]  is  single-valued  and  analytic  and  possesses  a  Laurent  series  expansion.  Thus 
before  proceeding  to  the  actual  definition  and  discussion  of  the  realization  of  the  sys¬ 
tem  D,  it  is  first  appropriate  to  review  some  of  the  properties  of  the  complex 
logarithm. 

2.  1  COMPLEX  LOGARITHM 

The  function  X(z)  can  be  expressed  as 

X(z)  =  |X(z)  |  eJ  ar®  [X{Z)\ 

The  logarithm  of  X(z)  is  defined  as 

log  [X(z)]  =  log  |  X(z)  |  +  j  arg  [X(z)].  (11) 

since  e^2wt*  =  1  for  any  positive  or  negative  integer  q,  it  is  clear  that  we  may  always 
write  arg  [X(z)]  as 

arg  [X(z)J  =  ARG  [X(z)]  ±  j2nq, 

where  q  =  0,  1, 2 . and 

-ir  <  ARG  [X(z)]  n. 

Therefore  log  [X(z)j  may  be  expressed  as 

log  [X(z)J  =  log  jX(z)J  +  j  ARG  [X(z)]  ±  j2nq.  (12) 

That  is,  the  complex  logarithm  is  multivalued,  with  infinitely  many  possible  values.  The 
principal  value  of  log  [X(z)]  is  defined  as  the  value  of  Eq.  12  when  q  =  0,  and  ARG  [X(z)] 
is  called  the  principal  value  of  arg  [X(z)].  (Henceforth,  the  principal  value  of  an  angle 
will  be  denoted  by  capital  letters.) 

The  transformation  D  must  be  unique.  Therefore  the  logarithm  must  be  so  defined 
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that  there  is  no  ambiguity  with  respect  to  its  imaginary  part.  Furthermore,  we  require 
that  log  [X(z)j  be  analytic  in  some  annular  region  of  the  z  plane  because  the  values  of 
the  complex  cepstrum  x  are  defined  as 


*  nr  £  l0*  [X(Z*"~‘  d~- 


In  Eq.  13,  C  is  a  circular  contour  specified  by 


z  =  eff+jw 


-it  <  w  ^  ir, 


where  eff  is  the  radius  of  the  circle.  In  Eq.  13  it  is  assumed  that  log  [X(z)]  has  a 
Laurent  series  expansion  as  in  (10).  Thus  we  must  insure  that  log  [X(eff+^w)]  is  analytic 
in  an  annular  region  containing  the  circle  with  radius  ec.  This  region  is  appropriately 
called  the  region  of  convergence  of  log  [X(z)J  or  of  X(z). 

In  general,  the  principal  value  of  the  phase,  ARG  [X(eff+^w)]  will  be  a  discontinuous 
function  of  «.  In  fact,  ARG  [X(et7+^0J)  ]  will  be  discontinuous  for  values  of  w  for  which 


arg  [X(eff+jw)]  =  mr, 


n  =  ±1 ,  +3,  +5, . 


A  typical  example  of  a  phase  curve  and  its  corresponding  principal  value  is  shown  in 
Fig.  5.  If  the  principal  value  of  the  phase  is  used  in  defining  the  complex  logarithm. 


arg  [X(«®+j“)J 


ARG  CX(.0+iu)] 


Fig.  5.  (a)  Typical  phase  curve  for  a  z -transform  evaluated  on  a 
circular  contour  about  z  =  0. 

(b)  The  principal  value  of  the  phase  curve  in  (a). 
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its  derivative  does  not  exist  at  the  points  of  discontinuity  of  ARG  [X(eff+^w)].  Therefore 
the  function  log  [X(ea+^w)]  would  fail  to  be  analytic  at  these  points.  Because  log  [X(z)J 
must  be  analytic  on  the  contour  C,  we  must  eliminate  such  singular  behavior  by  com¬ 
puting  a  phase  curve  with  no  discontinuities. 

We  also  recall  that  if 

X(z)  =  Xj(z)  X2(z), 

then  we  require  that 

log  (X(z)]  =  log  [Xj(z)]  t  log  [X2(z)], 

on  the  contour  C . 

If  we  write 

X^z)  =  (X1(z)  |  ei  arg  ^X(z)^ 

and 

X2(z)  =  |X2(z)|  ej  arg  [X(2)l 
then  we  require  that 

log  | X(z)  |  =  log  (Xjfz)  |  +  log  |X2(z)|  <14) 

and 

arg  [X(z)]  =  arg  [Xj(z)]  +  arg  [X2(z)],  (15) 

where  z  =  ea+^w  and  -v  <  w  ^  ir.  Since  log  |  X(z)  |  is  simply  the  logarithm  of  a  positive 
real  number,  (14)  will  be  satisfied  whenever  |Xj(z)|  and  |X2(z)|  are  nonzero  and  finite. 
With  respect  to  the  phase  angles,  we  can  write 


arg  [X(z)]  =  ARG  [X(z)J  ±  j2irq 

(16a) 

arg  [Xj(z)]  =  ARG  [Xj(z)J  ±  jZnqj 

(16b) 

arg  [X2(z)]  =  ARG  [X2(z)]  ±  j2irq2, 

(16c) 

where  q,  qj,  and  q2  are  integers.  Clearly,  (15)  will  hold  only  if  we  choose  the  appro¬ 
priate  value  for  arg  [X(z)].  For  example,  suppose  that  we  choose  the  principal  value 
for  all  angles.  It  can  be  shown  that,  in  general, 

ARG  [X(z)3  *  ARG  [Xj(z)]  +  ARG  [X2(z)j. 

One  way  of  insuring  that  (15)  will  always  hold  is  to  assume  that  all  angles  are  computed 
so  that  they  are  continuous  functions  of  w  as  z  varies  along  the  contour  C  specified  by 
z  =  e0^10.  This  implies  that  for  each  value  of  w,  we  have  chosen  the  proper  values  for 
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q,  qj,  and  q2  in  Eqs,  16  so  that  all  angles  are  continuous  functions  of  w.  In  the  actual 
computation  we  only  compute  arg  [X(z)],  so  the  proper  choice  of  q}  and  q2  is  implicit 
in  the  proper  choice  of  q.  Thus  requiring  that  the  phase  curve  be  continuous  also 
implies  that  Eq.  15  is  satisfied. 

Two  other  restrictions  on  the  form  of  arg  [X(z)]  result  from  considerations  that  do 
not  have  to  do  with  the  logarithmic  operation.  If  we  require  that  x(n)  be  real  when  x(n) 
is  real,  the  real  part  of  X(eff+^°)  must  be  an  even  function  of  w  and  the  imaginary  part 
of  X(eff+^w)  must  be  an  odd  function  of  w.  Since  |X(eor+^w)|  is  even  for  real  x(n),  so  is 

Re  [X(eff+jw)]  =  log  |X(eff+iw)|. 

The  requirement  on  the  imaginary  part  implies  that  we  must  define 
arg  [X{eff+-*w)]  =  -arg  [X(e<T“jw)]. 


A  final  condition  is  required  because  log  [X(z)]  is  to  be  the  z -transform  of  the  sequence 
x;  log  [X(eff+^w)]  must  be  periodic  in  w  with  period  2ir.  That  is, 

log  |X(eff+jw)|  =  log  |x(eff+jw±j2trk)f 

and 

arg  [X(e0+jw)]  =  arg  [X(eff+jw±;i2,rk)], 


It  where  k  =  0,  1,2,  ....  This  periodicity  and  the  even  and  odd  symmetry  properties 

imply  that  log  |X(e<r+J<‘))  |  has  even  symmetry  about  w  *  0,  ±ir,  ±2ir . and  likewise 

;  arg  [X(eff+Ja>)]  has  odd  symmetry  about  w  =  0,  ±tr,  ±2ir,  .... 

I  To  summarize,  the  conditions  that  are  imposed  on 

j  Im[X(z)]=  arg[X(z)] 

;  are  the  following. 

>  (Cl)  arg  [X(z)]  is  a  continuous  function  of  w  for  z  =  e0+^w. 

r 

'  (C2)  arg  [X(z)]  is  an  odd  function  of  w  for  z  =  e0+^w. 

i  (C3)  arg  [X(z)]  is  periodic  in  w,  with  period  2ir  for  z  =  eff+^w. 

Conditions  similar  to  (C2)  and  (C3)  apply  to  log  }X(z)j  and  follow  automatically  from 
the  definition  of  the  logarithm  of  a  real  number  and  the  symmetry  properties  of  the  mag¬ 
nitude  of  a  z-transform.  These  conditions  are  the  following. 

(C5)  log  |  X(z)  |  an  even  function  of  u>  for  z  =  ea+^w. 

(C6)  log  |X(z)j  is  periodic  in  u>,  with  period  2w  for  z  =  eff+^w. 
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2.  2  REALIZATIONS  FOR  THE  SYSTEMS  D  AND  D 


-1 


We  have  seen  that  if  special  care  is  taken  in  defining  the  complex  logarithm,  the 
logarithm  of  a  product  of  z-transforms  is  the  sum  of  the  logarithms.  Furthermore, 
under  these  conditions,  log  [X(z)]  can  also  be  thought  of  as  the  z-transform  of  the 


D 

Fig.  6.  Realization  of  the  characteristic  system  for  homomorphic 
deconvolution  using  z-transforms. 


complex  cepstrum.  Thus,  one  realization  of  the  system  D  is  that  shown  in  Fig.  6. 
The  complex  cepstrum  is  seen  to  be  the  result  of  the  equations 


00 


X(z)  =  )  x(n)  z"n 

(17a) 

n=-oo 

X(z)  =  log  [X(z)] 

(17b) 

x(n)  =  §  lo8  [X(z)3  zn_1  dz, 

(17c) 

where  the  closed  contour  C  lies  in  a  region  in  which  log  [X(z)]  has  been  defined  as 
single -valued  and  analytic. 


I - 1 


Fig.  7.  Realization  of  the  inverse  characteristic  system  for  homomorphic 
deconvolution  using  the  z-transform. 


Similarly  the  inverse  of  the  system  D  is  shown  in  Fig.  7.  Thus,  we  obtain  for  the 
output  of  D_1  the  equations 
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^  y(n)  z_n 


Y(z)  =  exp[Y(z)] 


2irj  Jp, 


Y(z)  zn_1  dz. 


In  (18c)  the  contour  C'  must  be  a  closed  contour  in  the  region  of  convergence  of  the  input 
z-transform  X(z).  This  is  required  because  if  the  linear  system  is  the  identity  system, 
we  require  that  the  over-all  system  be  the  identity  system;  that  is,  if 

y(n)  =  x(n). 


y(n)  =  x(n). 


2.  3  INTEGRAL  RELATIONS  FOR  THE  COMPLEX  CEPSTRIJM 


We  '.lave  shown  that  the  complex  cepstrum  can  be  obtained  from  the  set  of  equations 


wu 

•l 


x(n)  z 


X(z)  =  log  [X(z)]  =  log  |X(z)  |  +  j  arg  [X(z)J  (19b) 

x(n)  =-^j  §  X(z)  zn_1  dz.  (19c) 

These  equations  constitute  a  definition  of  the  system  D  and  also  lead  to  a  computational 
realization.  We  shall  consider  Eq.  19c  and  show  how  it  may  be  used  in  studying  the  prop¬ 
erties  of  the  system  D. 

We  have  seen  that  the  circular  contour  C  must  lie  in  the  region  of  convergence  of 

A 

X(z).  In  using  these  equations  for  computation,  part  of  the  definition  of  the  system  D 

A 

is  the  choice  of  the  region  of  convergence  of  X(z)  In  general,  the  two-sided  transforms 

a  23  26 

X(z)  and  X(z)  have  regions  of  convergence  which  are  annular  regions  of  the  z  plane.  ’ 

For  example,  we  shall  usually  denote  the  region  of  convergence  by  a  relation  of  the  form 
R+  <  |*|  <R_. 

By  definition,  these  regions  can  contain  no  singularities  of  the  z-transforms.  The  regions 

of  convergence  may,  however,  contain  zeros  of  the  z-transforms,  and  we  shall  see  that 

A  a 

these  cases  require  special  handling.  Since  X(z)  is  the  complex  logarithm  of  X(z),  X(z) 
will  have  singularities  at  all  of  the  singularities  and  at  all  of  the  zeros  of  X(z).  Similarly, 


A  40 

X(z)  will  have  zeros  at  all  of  the  ones  (X(z)  =  eJ  )  of  X(z).  Therefore  we  see  that  the 

A 

region  of  convergence  of  X(z)  can  be  the  same  ac  the  region  of  convergence  of  X(z)  only 
if  X(z)  has  no  zeros  in  its  region  of  convergence.  On  the  other  hand,  it  should  be  clear 
that  we  are  free  to  choose  any  annular  region  that  does  not  contain  singularities  or  zeros 

A 

of  X(z)  as  the  region  of  convergence  of  X(z). 

A 

The  choice  of  the  region  of  convergence  for  X(z)  is  based  primarily  on  computational 
considerations,  and  at  least  two  different  choices  have  been  found  useful.  In  any  case, 
it  should  be  clear  that  for  a  given  input  sequence  x,  it  is  possible  to  obtain  many  dif¬ 
ferent  output  sequences  x  depending  on  the  region  of  convergence  that  is  chosen  for 

A 

X(z).  This  does  not  mean  that  the  output  is  not  unique  because  the  choice  of  the  contour  C 

A 

(and  therefore  the  choice  of  the  region  of  convergence  of  X(z))  is  part  of  the  definition 
of  the  characteristic  system  O.  Once  this  contour  is  fixed,  the  output  is  uniquely  deter¬ 
mined. 

Let  us  temporarily  leave  the  contour  C  unspecified  and  obtain  a  more  useful  expres¬ 
sion  for  x(n).  Using  Eqs.  19b  and  19c,  we  obtain 


(20) 


If  we  note  that  the  contour  C  is  specified  by  z  =  e0+3w,  with  -ir  <  w  <  ir,  we  can 
write  (20) 


X(n)  *  f *  log  [x(e<r+jw)]  effn  «>n  dw.  (21) 

— tr 

We  shall  proceed  to  integrate  (21)  by  parts  under  the  assumption  that  log  [X(eff+^w)] 
is  a  single-valued  periodic  function  of  w  which  is  everywhere  continuous.  We  shall  find 
that  this  assumption  is  somewhat  restrictive,  but  we  shall  also  show  how  the  results 
derived  here  apply  to  more  general  circumstances. 

If  we  integrate  (21)  by  parts,  we  obtain,  for  n  *  0, 


Sw "  efe  -  jk  f  -£r‘°e  °"n  ■i"n 

•IT 


Because  both  log  [X(eff+^w)]  and  eJun  are  periodic  with  period  2rr,  the  first  term  in  the 
expression  above  vanishes.  Since  we  have  assumed  that  log  (X(eff+^w)]  is  continuous 
everywhere,  we  obtain 


x(n)  = 


2w]n 


1  f*  dw 

J 


X(eff+h 


X(eff+^) 


-  ecn  eJwn  dw. 


(22) 


Since  the  logarithmic  derivative  is  also  analytic  in  the  region  of  convergence  of  X(z)  we 
may  write  (22) 


zX'(z) 

X(z) 


_n-l 


dz, 


(23) 


where  the  prime  indicates  differentiation  with  respect  to  z.  Tha  contour  C  is,  of 

A 

course,  still  in  the  region  of  convergence  of  X(z). 

The  value  of  x  at  n  =  0  is  obtained  directly  from  Eq.  21;  that  is, 


x(0)  =  J_  |  ^  log  [X(eff+i“)]  du. 

“TT 


Since  arg  [X{eff+^w)]  is  an  odd  function  of  u  and  log  ( X(e<r+Jw)  |  is  an  even  function  of  w, 
*i0)=W§"  log  lx<e<r+j“)l  <**>•  (24) 


Thus  as  an  alternative  to  Eqs.  19  for  analysis,  and  possibly  for  computational  pur¬ 
poses,  we  have  Eqs.  23  and  24,  under  the  assumption  that  X(eff+^“)  is  a  single -valued 
and  continuous  function  of  «  for  all  w.  We  shall  see  that  this  condition  must  be  relaxed 
in  order  to  include  most  situations  of  interest.  (This  will  be  done  in  section  2.  9.) 

2.4  "TIME-DOMAIN"  EXPRESSIONS  FOR  THE  COMPLEX 
CEPSTRUM 


The  expressions  just  derived  gave  the  complex  cepatrum  x  explicitly  in  terms  of 
the  z -transform  of  x.  Equation  23  may  be  used  to  obtain  an  implicit  expression  in 
terms  of  x(n)  tnd  x(n)  which,  in  certain  cases,  reduces  to  a  recursion  formula. 

This  implicit  relation  may  be  derived  as  follows.  If  we  assume  that  log  [X(eff+^W)J 
is  continuous  for  all  «,  we  can  write 

/v  X'(z) 

X'(z)  = - . 

X(z) 

Rearranging  this  expression,  we  obtain 

zX'(z)  =  zX'(z)  X(z).  (25) 

Since 


oo 

X'(z)  =  z-1  ^  -  nx(n)  z“n, 

n=-oo 


we  see  that  the  inverse  z-transform  of  (25)  is 


nx(n)  = 


00 


y  kx(k) 


k=-oo 


x(n-k). 


(26) 
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There  are  several  special  cases  of  (26)  that  are  worthy  of  special  consideration. 


Case  1:  x(n)  =  0  for  n  <  0,  and  x(0)  *  0. 
In  this  case  we  can  write 


nx(n) 


U 

■l 


kx(k)  x(n-k), 


k=-« 

which  can  be  written  as 
n-1 


x(n)  = 


x(n) 

x(0) 


kx(k) 


k=-‘v 


x(n-k) 

x(0) 


n  *  0. 


(27) 


Thus  we  see  that  x(n)  depends  on  all  values  of  x  and  the  values  of  x  for  k  <  n. 

Case  2;  Suppose  that  x{0)  *  0  and  x(n)  =  0  for  n  <  0.  If  we  further  assume  that  x(n)  =  0 
for  n  <  0,  we  obtain  froxr  Eq.  27 


x(n) 


x(n)  "f-1 


x(0) 


L 

k=0 


^  x(n-k) 


n  >  0. 


x(0) 


(28) 


The  value  of  x(0)  for  sequences  of  this  type  can  be  shown  to  be  (r<jf  section  2.  5) 

x(0)  =  log  x(0).  (29) 

Requiring  that  x(n)  =  0  for  n<0  is  equivalent  to  choosing  the  contour  C  in  (23)  so  as  to 
enclose  all  of  the  poles  and  zeros  of  X(z).  If  X(z)  has  poles  or  zeros  outside  the  unit 

ft  q 

circle,  it  can  be  shown  ^  that  x(n)  will  be  unbounded  for  large  n,  since  we  are  effec¬ 
tively  choosing  the  region  of  convergence  to  be  outside  of  all  of  the  poles  and  zeros  of 
X(z).  This  will  not  be  the  case,  however,  if  X(z)  has  all  of  its  poles  and  zeros  inside 
the  unit  circle. 

Thus  when  X(z)  has  all  of  its  poles  and  zeros  inside  the  unit  circle,  x(n)  satisfies  a 
recursion  relation  that  could  be  us^d  in  actually  computing  x(n).  (Discussion  of  the  util¬ 
ity  of  this  expression  is  reserved  for  Section  III.) 

Finally,  we  observe  that  Eqs.  28  and  29  provide  a  way  of  obtaining  x  from  x,  that 
is,  a  recursive  relation  for  the  inverse  characteristic  system.  By  rearranging  Eqs.  28 
and  29,  we  obtain 


x(0)  =  e 


x(0) 


n-1 


x(n)  =  x(n)  x(0)  + 


la) 


x(k)  x(n-k) 


n  >  0. 


(30a) 


(30b) 


k=0 


Equations  30  represent  a  realization  of  the  inverse  characteristic  system  for  sequences 
whose  z-transforms  have  no  poles  and  zeros  outside  the  unit  circle. 
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Case  3:  Suppose  that  x\0)  *  0,  x(n)  =  0  for  n  <  0,  and  x(n)  =  0  for  n  <  0  and  n  >  M. 
In  this  case,  Eqs.  28  and  29  take  the  form 

x(Q)  =  log  x(0) 


2(n)  « 


- 1  © 


x(n-k) 


0  <  n  $  M 


U-i 

-  I  ©' 


x(n-k) 


k=n-M 


n  >  M. 


Case  4:  Let  x(n)  =  x(n>  =  0  for  n  >  0  and  x(0)  #  0, 

These  assumptions  are  equivalent  to  taking  the  contour  C  in  (23)  to  be  inside  all  of 

the  poles  and  zeros  of  X(z).  Thus  for  a  stable  sequence  x,  we  require  that  all  of  the 

21 

poles  and  zeros  be  outside  the  unit  circle.  Using  Eq.  26,  we  arrive  at 

x(0)  =  log  x(0)  (32a) 


x(n)  =- 


V 

-  I  © 


x(k)  x(n-k) 


k=n+l 


n  <  0. 


2,5  COMPLEX  CEPSTRUM  FOR  SEQUENCES  WITH  RATIONAL 
z -TRANSFORMS 

In  actual  computations,  we  are  always  restricted  to  sequences  of  finite  length  and 
hence  to  z-transforms  that  are  simply  polynomials  in  z  .  Thu  ;  it  is  not  a  significant 
restriction  if  we  consider  z-transforms  of  the  form 


X(z)  =  A 


m.  mQ 

n  (l-a.  z-1 )  n  (l-b,z) 
k=l  v  K  '  k=l  K 

Pi  P0 

n  (i-c^z"1)  n  d-d.z) 

k=l  V  K  '  k=l  K 


where  A  is  a  positive -real  constant,  and  the  a^,  b^,  c^  and  d^  are  nonzero  complex 
numbers  whose  magnitudes  are  less  than  one.  If  x  is  a  real  sequence,  then  the  a^,  b^, 
ck  and  d^  occur  in  complex  conjugate  pairs.  Careful  examination  of  Eq.  33  shows  that 
there  are  nu  zeros  and  p^  poles  inside  the  unit  circle,  and  mo  zeros  and  pQ  poles  out¬ 
side  the  unit  circle.  Clearly,  (33)  is  not  the  most  general  rational  z~transform,  since 

T 

A  could  be  negative  and  in  general  we  must  include  a  factor  of  the  form  z  to  account 
for  all  shifted  versions  of  the  sequence  x.  Since  our  method  in  computation  will  be  to 
deal  with  these  issues  separately,  we  shall  defer  discussion  of  these  points. 


We  have  shown  that  the  phase  curve  must  be  a  continuous  function  of  w.  Since 
arg  (X(e0+^w)]  will  s.e  the  sum  of  the  arguments  of  each  multiplicative  factor  in  Eq.  33, 
it  is  helpful  to  consider  the  contributions  from  each  of  these  factors.  Figures  8  and  9 
show  the  typical  pole-zero  plots  and  one  period  of  phase  curves  when  z  =  eJW  for  each 
type  of  numerator  factor  in  Eq.  33.  The  corresponding  denominator  factors  produce 
phase  curves  that  are  the  same  except  for  sign.  In  all  cases,  the  peak  value  of  these 
phase  curves  is  less  than  or  equal  to  ir/2.  The  value  ir/2  is  attained  only  when  the  zeros 
(or  poles)  lie  on  the  unit  circle.  If  the  zeros  (or  poles)  are  on  the  unit  circle,  the  phase 
curves  become  discontinuous.  We  also  observe  from  Figs.  8  and  9  that  all  of  the  phase 
curves  of  these  factors  are  zero  at  w  =  0,  ±ir,  ±2rr . 

Since  the  total  phase  curve  for  Eq.  33  is  the  sum  of  the  phase  curves  of  each  factor, 
the  total  phase  curve  wil'.  be  zero  at  co  =  0,  fir,  ±2ir,  ....  Furthermore,  it  is  clear  that 
arg  [X«eff+^w)]  will  in  general  be  greater  in  magnitude  than  it.  Therefore  in  computing 
the  phase,  we  must  use  an  algorithm  that  enables  us  to  determine  the  correct  phase 
curve,  that  is,  one  without  discontinuities. 

One  such  algorithm  computes  the  principal  value  of  the  phase  and  then  determines 
the  correct  multiple  of  2ir  to  add  to  or  subtract  from  the  principal  value  for  each  value 
of  <d.  This  algorithm  is  discussed  in  Section  III. 

We  are  also  interested  in  log  |X(ea+^w)  | ,  since  this  is  the  real  part  of  the  complex 
logarithm.  Since  the  magnitude  is  an  even  function  of  w,  it  will  have  the  same  general 
form  for  poles  and  zeros  both  inside  and  outside  the  unit  circle.  Let  us  consider  a  fac- 

jiji 

tor  such  as  (l  -zoz-1 )  for  z  =  e^w,  and  zQ  =  |zq  j  e  °.  The  magnitude  of  such  a  factor 
is 

l1_z0e"jwl =  [1  +  Iz0I2"2Iz0I cos<u-v]  • 

Taking  the  logarithm,  we  obtain 

log  |l-zoe“jw|  =  ylog  [\  +  |zJ2-2|zo|  cos(«-<J>o)j. 

This  function  is  sketched  in  Fig.  10.  We  note  that  it  is  periodic  with  period  2w.  The 
maximum  positive  value  is  4- log  ( 1  +  2 1 z  |  4  |z  |2)  which  approaches  log  (2)  as  Jz  | 
approaches  1.  Similarly,  the  most  negative  value  is  y  log  U  -2  jzQ|  +  |zq|  )  which 
approaches  log  (0)  or  -«o  as  |zQj  approaches  1. 

Since  log  |X(e^°)j  is  the  sum  of  terms  such  as  this  (with  negative  signs  for  denom¬ 
inator  factors),  we  would  expect  that  log  |X(e^w)  |  would  have  an  appearance  similar  to 
that  of  Fig.  10,  except  that  in  general  there  will  be  peaks  corresponding  to  each  of  the 
poles  and  zeros  of  X(e^°).  A  typical  example  of  log  |X(e^w)  j  is  showr  in  Fig.  1 1 . 

We  have  seen  that  z -transforms  having  the  form  of  Eq.  33  satisfy  the  requirement 
that  log  [X(eff+^w)]  be  continuous  everywhere.  Thus  we  may  employ  Eq.  23  to  evaluate 
the  complex  cepstrum.  The  integrand  zX'(z)/X(z),  in  this  case,  is 
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i 


i  o  ° 


1  “bkz 


k- 1  1  "  V 


V  dk 

^r+Z 


Since 


r  zX'(z)  . 

x(n)  =  2irin  T  -  Z  dz* 

2wjn  Jc  X(2) 

we  see  that  if  we  desire  a  stable  sequence  (one  whose  values  approach  zero  for  large  n), 
we  must  choose  the  region  of  convergence  to  include  the  unit  circle.  Each  factor  in 
(34)  is  the  z-transform  of  an  exponential  sequence.  Therefore,  if  the  contour  is  taken 
as  the  unit  circle,  x(n)  is  given  by 

Pi  n  mi  n 

-  1  T  -  I  TT  nsl  |35a| 


P°  . -n  mo  j-n 

.  y  h..  y  is_ 

L  n  n 


n  «  -1 . 


The  value  of  x(0)  is  obtained  from  Eq.  24  with  a  -  0.  Therefore 

x(0)  =  -^  J*  log  |X(eju)|  dw. 

-ir 

Each  factor  of  |  X(e^)  |  has  tne  form 

1 1  —  a  e^w|  =  1  +  |  a  | 2  —  2 1  a  |  cos  (ul  arg[a]), 
and  it  can  be  shown  that 


log  (1  +  |  a [ 2  —  2 1  a.  j  cos  (u?  arg  [a]))  dw  =  0, 


if  a  ^  1.  Therefore  we  see  that 


i  r 

=  27  J  log  A  dw  =  log  A. 


Equations  35  and  36  express  x(n)  in  terms  of  the  poles  and  zeros  of  the  rational 
z-transform.  They  also  illustrate  an  important  property  of  the  complex  cepstrum.  It 
is  clear  from  Eqs.  35  that 


|x(n)  |  «  B 


n^O, 


where  B  is  a  positive  constant  and  a  is  the  magnitude  of  the  pole  or  zero  that  is  closest 
to  the  unit  circle. 

In  many  simple  cases,  it  is  not  necessary  or  desirable  to  use  the  integral  formulas 
tor  purposes  of  analysis.  This  is  particularly  true  when  the  z -transform  is  a  rational 
function.  In  this  case  a  power-series  expansion  of  log  [X(z)]  is  usually  more  convenient. 

Under  the  assumptions  that  log  [X(z)]  is  defined  to  be  single-valued  and  analytic  in 
the  region  of  convergence,  and  that  X(z)  has  the  form  of  Eq.  33,  we  may  write 


A 

log  [X(z)]  =  log  A  +  ^  lc8  (1-akz  l  ) 
k=l 

mo  Pi  Po 

+  ^  log  (l-bkz)  -  ^  log  ^l-ckz-1)  -  ^  log  (1-dj^z). 


Since  we  define  X(z)  as  a  z-transform,  it  must  be  true  that 


=  log  [X(z)]  =  2  z~" 


Thus  we  immediately  see  that 
x(0)  =  log  A. 

If  we  effectively  take  the  contour  C  to  be  the  unit  circle,  then  each  of  the  remaining 
terms  in  (37)  can  be  expanded  in  a  Laurent  series  about  z-  0.  For  example,  we  can  write 


log  (l-akz"1)=  -  £ 


00  n 


for  |  z  |  >  laj 


oo  n 
t  Ci. 


-log(l-V-‘)= 


for  |z 


-1  -n 


log  (1 


-bkz)=  ^  z~”  for  |z|  <  (b"1  J 


-!  d"n 

-log  (1-c^z)  =  -  2  "rz"n  for  |z|  <  (d^1 1. 
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Therefore  if  we  add  these  convergent  series  and  collect  the  coefficients  of  z-n,  we  can 
determine  x(n).  In  general,  we  see  that  x(n)  can  be  written 


x(0)  =  log  A 


£(n) 


Pi  mi  an 

vjc  y  \ 
“  L  n  '  L  n 


k=l 


k=l 


n  s*  1 


mo  . -n  Po  ,-n 

.  s;  \  \ 

Lj  n  n 


k=l 


k=! 


n  «  -1, 


(39a) 


(39b) 


(39c) 


Equations  39  agree  with  Eqs.  35  and  36,  as  we  would  expect.  The  real  value  of  the 
power-series  approach  is  best  illustrated  by  our  use  of  it  in  discussing  echo  removal 
applications  in  Section  V. 


2.6  MINIMUM-PHASE  AND  MAXIMUM-PHASE  SEQUENCES 


We  have  considered  a  realization  of  the  system  D  which  was  based  on  the  z- 
transform.  In  some  cases  it  is  possible  to  take  advantage  of  the  properties  of  the  z- 
transform  to  obtain  simplified  results.  For  example,  we  have  seen  (section  2.4)  that 
under  certain  conditions,  x(n)  obeys  a  recursion  formula.  We  shall  now  consider  these 
cases  in  detail  and  present  an  alternative  computation  scheme. 

A  minimum  -  phase  sequence  is  defined  as  a  sequence  whose  z -transform  has  no  poles 
or  zeros  outside  the  unit  circle.  Furthermore,  the  region  of  convergence  for  the 
z-transform  includes  the  unit  circle.  For  rational  z -transforms,  X(z)  is  of  the  form 


X(z)  =  A 


where  the  a^  and  c^  are  complex  numbers  whose  magnitudes  are  less  than  one,  and  the 
region  of  convergence  is  specified  by 


|z|  >  max  |c  j. 
k  K 


Such  sequences  have  the  properties 


n  <  0 


x(n)  =  0 
x(0)  *  0 
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(40a) 

(40b) 


1 


«w 

^  |  x(n)  j  <  oo. 


(40c) 


n=0 


We  have  shown  that  the  complex  cepstrum  of  such  a  sequence  has  the  properties 

n  <  0  (41a) 


x(n)  =  0 


x(0)  =  log  x(0) 


vw 

I  |i(n)| 


<  00. 


(41b) 


(41  c) 


n=0 


Since  Eqs.  40  and  41  are  necessary  and  sufficient  conditions,  these  equations  could  be 
taken  as  the  definition  of  a  minimum- phase  sequence. 

An  entirely  analogous  situation  is  called  maximum-phase.  In  this  case,  X(z)  has 
all  poles  and  zeros  outside  of  the  unit  circle,  and  the  region  of  convergence  includes 
the  unit  circle.  In  this  case,  x  has  the  properties 


x(n)  =  0 


x{0)  #  0 


n  >  0 


£  Ml 


<  00. 


(42a) 


(42b) 


(42c) 


n=-oo 


x(0)  =  0 


Similarly,  the  complex  cepstrum  has  the  properties 
n  >  0 

x(0)  =  log  x(0) 

0 


2,  l*(n)  I 


< 


(43a) 

(43b) 

(43c) 


D~-oo 


For  rational  z-transforms,  X(z)  has  the  form 


m. 


X(Z)  r 


B  n  (1-b.z) 
k=l  K 


n  (i-dkz) 
k=l  K 


where  the  and  d^  are  all  less  than  one  in  magnitude,  and  the  region  of  convergence  is 
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jzj  <  min  |dT*  j . 
k  K 

In  general,  there  may  be  poles  and/or  zeros  on  the  unit  circle.  These  cases  will 
be  formally  excluded  from  either  class;  however,  we  shall  see  (section  2.  7)  that  it  xs 
possible  to  move  such  poles  and  zeros  inside  or  outside  the  unit  circle  by  exponential 
weighting  of  the  sequence. 

If  the  input  sequence  is  known  to  be  minimum-phase,  we  can  obtain  significant  sim¬ 
plifications  in  our  results.  We  have  already  seen  that  the  characteristic  system  D  and 
its  inverse  can  be  realized  through  a  recursion  formula.  We  now  wish  to  show  that  the 
properties  of  minimum-phase  sequences  allow  other  simplifications  in  the  computation 
of  the  complex  cepstrum. 

Let  us  introduce  some  definitions.  We  define  the  even  part  of  a  sequence  to  be  the 
sequence  whose  values  are 

x(n)  +  x(-n) 

Ev  [x(n>]  = - - - .  (44) 

The  sequence  Ev  [x(n)]  is  seen  to  have  even  symmetry;  that  is, 

Ev  [x(n)J  =  Ev  [x(-n)]. 

Similarly,  we  define  the  odd  part  of  a  sequence  as 
x(n)  -  x(-n) 

Odd  [x(n)J  - - g - ,  (45) 

which  has  odd  symmetry;  that  is, 

Odd  [x(n) ]  =  -Odd  [x(-n)J. 

It  can  be  shown  that  if  x  is  a  real  sequence  and 
X(ejw)  =  Xr(eiw)  +  jX.(e^), 

then  Xr(eJW)  is  the  transform  of  Ev  [x(n)]  and  similarly,  jX^(e^w)  is  the  transform  of 
Odd  [x(n)J. 

Let  us  assume  that  x(n)  =  0  for  n  <  0.  In  this  case  we  can  see  from  Eq.  44  that 

x(n)  =  2  Ev  [x(n)J  n  >  0 
=  Ev  [x(n)J  n  =  0 

=  0  n  <  0. 

That  is,  knowledge  of  the  even  part  of  a  sequence  that  is  zero  for  n  <  0  is  sufficient  to 
determine  the  entire  sequence. 

These  properties  represent  a  real  part  sufficiency  theorem  for  z-transforms  of 
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I 


1 


ii 
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sequences  that  are  zero  for  n  <  0.  For  example,  suppose  we  are  given  the  real  part 
Xr(e^w)  of  the  transform  of  a  sequence  x.  Since  Xr(e^w)  is  the  transform  of  the  even 
part  of  the  sequence,  we  can  determine  Ev  [x(n)]  from  Xy(e^w).  If  x(n)  =  0  for  n  <  0,  we 
can  determine  the  sequence  x  and  therefore  X(e^w).  Thus  knowledge  of  Xr(e^w)  is  suf¬ 
ficient  to  completely  determine  X(e^w). 

Similar  relations  hold  between  the  logarithm  of  the  magnitude  and  the  phase  of  a 
z-transform,  but  under  more  restricted  conditions.  Rather  than  focus  our  attention  on 
relations  between  the  magnitude  and  phase,  let  us  consider  the  complex  cepstrum  first 
and  return  to  this  question  eventually.  If  x(n)  =  0  for  n  <  0,  we  can  apply  the  previous 
results  to  the  complex  cepstrum  to  obtain 


x(n)  =  2  Ev  [x(n>] 

n  >  0 

(46a) 

=  Ev  [x(n)] 

n  =  0 

(46b) 

=  0 

n  <  0. 

(46c) 

Since  the  real  part  of  X(e^w)  is  just 
Xr(ejw)  =  log  |X(ejw)|, 
we  see  that 

Ev  Cac(n) ]  =  ±  log  |X(ejw)  |  ejwn  d*. 

—IT 

Thus,  if  x(n)  =  0  for  n  <  0,  we  only  need  to  compute 
Xr(ejw)  =  log  |X(ejw)|, 
and  we  do  not  need  to  compute  the  phase. 

If  we  wish  to  obtain  the  original  sequence  from  log  |  X(e^w)  | ,  we  can  do  so  by  first 
computing  Ev  [x(n)],  then  x  by  using  Eqs.  46,  and  then  obtain  the  original  sequence  by 
using  the  inverse  characteristic  system  to  compute  x.  Since  the  condition  x(n)  =  0  for 
n  <  0  has  been  shown  to  be  equivalent  to  the  condition  that  x  be  minimum-phase,  we 
see  that  for  minimum-phase  sequences,  the  complex  cepstrum  can  be  computed  from 


00 


X(ejw)  =  )  x(n)  e-jwn 

(47) 

W 

n=0 

Ev  [x(n)]  log  |X(ejw)|  <r>wn  dw 

—IT 

(48) 

x(n)  =  Ev  [x(n)]  u(n). 

(49) 
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.-.were 

u{n)  =  2  n  >  0 

=  1  n  =  0 

=  0  n  <  0. 

Thus  the  characteristic  system  D  for  minimum- phase  sequences  can  be  realized  as 
shown  in  Fig.  12,  where  "Fourier  Transform"  means  X(e^w). 

Clearly,  the  discussion  above  also  indicates  a  unique  relation  between  log  |  X(eJt0)  | 


Fig.  i2.  A  realization  of  the  characteristic  system  D  for  minimum 
phase  sequences. 


and  arg  [X(e^u)J,  In  fact,  the  operations  illustrated  in  Fig.  12  are  equivalent  to  using 
the  Hilbert  transform24  to  obtain  the  proper  phase  curve  for  log  |X(e^w)|  when  X(e^°) 
is  minimum -phase. 

As  an  example  of  the  use  of  this  result,  let  us  consider  the  minimum-phase  sequence 

x(n)  =  an  n?0 

=  0  n  <  0. 

The  z -transform  of  this  sequence  is 

X(z)  = - - - 1-  for  )z|  >  | a ( . 

1  -  az 

and 

log  |X(ejw)j  =  --|-log  (l+a2-2acosu). 

Therefore  the  even  part  of  the  complex  cepstrum  is 

Ev(i(„>]=- log  (l+a?‘-2a  cosu)  e^  du 


which  can  be  written 


Ev  [x(n)j  =  - 


(1+a2 


2a  cos  w)  cos  wn  du. 
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At  this  point,  let  us  note  two  useful  equations  which  can  be  found  in  Carslaw. 


log  (1+a  -2acosu)  dw  =  0 


|  a  |  <  1 


=  it  log  &  a  >  1 


n  2  a* 

\  log  (1+a  -2acosw)  cos  wn  du  =  -it  — 
Jn  I . 


a|  «  1 


a  5-  1 . 


These  equations  hold  for  n  an  integer.  We  can  use  (50)  and  (51)  to  show  that 
Ev  [x(n)j  =0  n  =  0 


n  *  0, 


Therefore  we  see  that 


x(n)  =  2  Ev  [x(n)  ]  =  • 


n  «  0. 


In  conclusion,  we  wish  to  cs.!l  attention  to  an  interesting  representation  of  the  input 
sequence,  and  an  interesting  result  for  finite-length  sequences.  It  is  clear  from  the 
properties  of  minimum-phase  and  maximum-phase  sequences  that  every  sequence  x 
may  be  expressed  as 

x  =  xmin  ®  xma>:' 


where 


*min<n>  *  *  >  0 


*max(n)=x(n)  n<0- 


tor  rational  z -transforms,  this  is  equivalent  to 


X(z)  =  Xmin(z)  Xm3x(z), 


where 


K 


B 


i 


Xmin<z>  =  -IT1 


A  ",  O-V'1) 


",  O-V"1) 


k=l 


m 


n  (i-b.z) 

Xmaxtel  =  TT - 

n  (i-d.z) 

k=l  K 

The  results  given  here  have  particular  significance  for  finite-length  sequences. 
Suppose  that  X(z)  has  the  form 


mi  mQ 

X(z)  *  A  n  (l-a,z_1)  n  (1-b.z), 
k=l  '  K  '  k=l  * 


where  the  a^  and  b^  are  all  less  than  one  in  magnitude.  Clearly, 


and 


xmin<n)  *  0 


=  0 


xmax(n)  ^  0 


=  0 


0  «  n  <  m. 


elsewhere 


-m„  <  n  <  0 
o 


elsewhere. 


From  Eqs.  30  and  32,  we  obtain  the  relations 


Wn)  = e 


x(0) 


n  =  0 


n-1 


=  x{n)  x(0)  + 


2  (4)  x(k)  xmin<n-k) 


n  >0 


k=0 


xmax<n)  =  1 


n  =  0 
0 


=  x(n)+  1  (4)x(k)xmax(n-k>  n<0' 


k=n+l 


We  see,  therefore,  that  only  mQ  +  +  1  values  of  the  complex  cepstrum  are 

required  to  completely  '  etermine  the  mQ  +  m.  +  1  values  of  the  sequence  x.  This 


30 


i  « 


t 


result  implies  that  even  though  x  is  of  infinite  duration,  only  a  number  of  samples 
of  x  equal  to  the  length  of  the  input  sequence  x  is  required  to  completely  determine 
the  sequence  x  from  the  complex  cepstrum, 

2.7  EXPONENTIAL  WEIGHTING  OF  SEQUENCES 

Because  of  the  special  properties  of  minimum-phase  sequences,  it  is  of  interest 
to  consider  ways  of  obtaining  minimum-phase  sequences  from  nonminimum-phase 
sequences.  One  way  of  doing  this  is  to  weight  the  nonminimum-phase  sequence  with 
a  decaying  exponential.  By  this  we  mean  multiplication  of  the  values  of  a  sequence  by 
c11  to  obtain  a  new  sequence  whose  values  are 

w(n)  s  a\(n). 

There  are  two  important  points  to  consider.  First,  we  shall  consider  the  effect  of  expo¬ 
nential  weighting  on  a  convolution,  and  then  the  effect,  on  the  z-transform  and  its  region 
of  convergence. 

Suppose  that  x(n)  is  given  by 

00 

x(n)  =  ^  Xj(k)  x2(n-k). 

k=-oo 


For  the  exponentially  weighted  sequence,  we  obtain 


=  a11  ^  Xj[(k)  x2(n-k)  =  ^  akXj(k)  an"kx2(n-k) 


k=-co 


k=-<» 


=  ^  Wj(k)  w2(n-k). 


Therefore  exponential  weighting  of  a  convolution  of  two  sequences  x}  and  x2  is  seen  to 
be  equivalent  to  the  convolution  of  the  exponentially  weighted  sequences  w  j  and  w2  whose 
values  are 

Wj(n)  =  «nXj  (n) 
w2(n)  =  anw2(n). 

The  second  interesting  point  is  the  effect  of  exponential  weighting  on  the  z-transform 
of  a  sequence.  The  z-transform  of  the  weighted  sequence  w  is  given  by 

00 

W{z)  =  ^  onx(n)  z-  =  X(a-1z). 


,  ■£»'*?■■****•  '  - 


Therefore  we  see  that  if  X(z)  has  a  pole  or  zero  at  z  =  zQ,  then  W{z)  has  a  pole  or  zero 
at  azQ.  Thus  if  the  region  of  convergence  of  X(z)  is 

R+  <  |  z  |  <  R_, 

then  the  region  of  convergence  of  W(z)  is 

aR+  <  |z|  <  aR_. 

If  we  have  a  sequence  x  for  which  x(n)  =  0  for  n  <  0  but  is  nonminimum -phase,  then 
the  sequence  can  be  made  minimum-phase  by  appropriate  exponential  weighting.  We 
simply  need  to  multiply  x(n)  by  c11,  where  a  is  less  than  one  and  small  enough  to  move 
the  pole  or  zero  with  greatest  magnitude  inside  the  unit  circle. 

We  see,  then,  that  exponential  weighting  may  be  very  useful  because  convolutions 
are  preserved  and  it  permits  a  more  desirable  pole-zero  distribution.  We  should  point 
out,  however,  that  if  the  required  value  of  a  is  too  small,  we  shall  often  be  troubled 
with  rounding  errors  in  carrying  out  such  weighting  on  numbers  stored  in  a  computer. 

A  final  point  should  be  made.  Exponential  weighting  clearly  changes  the  complex 
cepstrum.  As  the  value  of  a  approaches  the  reciprocal  of  the  magnitude  of  the  pole  or 
zero  that  is  farthest  from  the  origin,  the  complex  cspstrum  becomes  zero  for  nega¬ 
tive  n.  If,  on  the  other  hand,  a  is  close  to  1 ,  it  will  not  significantly  affect  the  complex 
cepstrum,  unless  the  z-transform  of  the  input  has  poles  or  zeros  on  or  close  to  the  unit 
circle.  Since  convolutions  are  preserved  in  the  weighted  sequence,  the  complex  cep¬ 
strum  will  always  have  the  form 

w(n)  =  Wj(n)  +  w2/n). 

In  general,  there  is  not  a  simple  relationship  between  Wj(n)  and  Xj(n);  however,  if  the 
poles  and  zeros  of  Xj(z)  are  inside  the  unit  circle,  then  clearly  those  of  W.(z)  will  also 
be  inside  the  unit  circle  if  |a|  <  1.  Thus,  in  this  special  case,  if 

Wj(n)  =  anXj(n), 


then 


A  ,  . 
Wj(n) 


o'*  A  n). 


Although  there  may  not  always  be  such  a  simple  relationship  between  Wj(n)  and  x{(n) 
and  w2(n)  and  x2(n),  we  dc  still  have  a  way  of  recovering  Xj(n)  or  x2(n)  if  we  are  given 
Wj(n)  or  w2(n).  This  is  so  because 

Xj{n)  =  c-nWj  (n) 
x2(n)  =  a~nw2(n). 
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These  relations  are  useful  in  practice,  since-  tney  allow  us  to  work  with  exponentially 

weighted  sequences  as  inputs  to  the  system  D.  If  we  desire  the  original  sequences  at 

the  output,  we  can  simply  unweight,  using  the  relations  above.  In  practice,  this  idea 

is  useful  if  we  are  dealing  with  finite-length  sequences  such  that  x(n)  =  0  for  n  <  0  and 

M 

n  >  M,  and  if  we  can  choose  a  so  that  a  is  not  so  small  as  to  introduce  excessive 
rounding  error. 

2.  8  MORE  GENERAL  RATIONAL  z -TRANSFORMS 

Let  us  assume  that  the  z  -transform  of  the  input  to  the  system  D  has  the  form 

mi  m0 

n  (l-a.  z”1 )  n  (1-b.z) 

X(.)  -  Azr  JaLl . JL . ifel - -  .  (52) 

*1  **  O 

n  (i-c.z'1)  n  u-d.z) 
k=l  x  K  '  k=l  K 

In  all  of  our  previous  results  based  on  rational  z -transforms,  we  assumed  that  A  was 
positive  and  real  and  r  =  0.  This  was  to  insure  that  arg  [X(z)]  could  be  defined  as 
single-valued  and  continuous.  Clearly,  there  are  many  interesting  sequences  that  do 
not  have  z-transforms  of  this  form.  For  example,  if  we  allow  A  to  be  positive  or  neg¬ 
ative  and  r  *  0,  we  can  include  most  sequences  of  interest  for  computation.  In  fact, 
finite -length  sequences  have  z-transforms  of  the  form  of  Eq.  52,  with  the  c^  and 
equal  to  zero.  (Note  that  we  have  excluded  zeros  on  the  unit  circle.  These  could  be 
included  in  our  discussion  if  we  were  willing  to  consider  discontinuous  phase  curves  and 
logarithmic  infinities  in  the  log  magnitude.  For  simplicity,  we  shall  take  the  point  of 
view  that  zeros  on  the  unit  circle  have  been  shifted  inside  by  exponential  weighting.)  Let 
us  now  see  how  the  results  previously  presented  can  be  applied  in  this  more  general 
situation. 

When  X(z)  is  actually  the  product  of  two  or  more  z-transforms,  we  shall  assume 
that  each  term  in  the  product  is  written  in  the  form  of  (52).  Thus  the  constant  A  will 
be  the  product  of  the  corresponding  constants  of  the  individual  factors  of  X(z).  For 
example,  if 

X(z)  =  Xt(z)  •  X2(z), 

then 

A  =  A1  '  A2. 

Clearly,  A  will  be  positive  if  Aj  and  A2  are  both  of  the  same  sign,  and  A  will  be  nega¬ 
tive  if  the  signs  of  Aj  and  A2  differ.  That  is,  by  consideration  of  the  sign  of  A  it  will 
only  be  possible  to  determine  the  sign  of  Aj  relative  to  the  sign  of  A?.  In  most  situations, 
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the  appropriate  signs  will  be  clear  from  consideration  of  the  source  of  the  signals  or, 
in  many  cases,  the  signs  will  not  be  important. 

The  constant  term  A  contributes  an  integer  multiple  of  n  to  the  phase.  Since  we 
can  only  determine  whether  A  is  positive  or  negative,  we  normally  test  to  see  if  A  is 
positive  or  negative  before  computing  the  phase.  If  A  is  negative,  we  can  change  the 
sign  of  X(eff^w)  to  effectively  remove  any  contribution  to  the  phase  which  is  due  to  the 
sign  of  A.  Whether  A  was  positive  or  negative  can  be  remembered  if  this  information 
is  of  interest.  The  sign  of  A  can  be  determined  by  noting  that 

mi 

n  <i-aki  n  u-bk) 

X(l)  =  X(ej°)  =  A  ^ ^ - .  (53) 

"i  *o 

n  (i-c.)  n  d-d.) 

k=l  K  k=l  K 
,  and 

are  positive;  therefore,  the  sign  of  A  is  the  sign  of  X(l). 

r> 

Let  us  now  consider  the  effect  of  the  factor  z  in  (52).  Assuming  that  the  phase  is 
computed  as  specified  in  section  2.  1,  we  can  write  formally 


d^  are  all  less  than  one  in  magnitude,  all  of  the  factors  in  (53) 


Since  the  a^, 


X(z)  =  log  [zr]  +  log 


Thus,  the  complex  cepstrum  x  consists  of  a  component  having  all  of  the  properties  that 
we  have  previously  discussed  and  a  component  that  is  due  to  the  term  log  [zr].  To  see 
how  our  results  are  modified  by  this  term,  let  us  consider  the  phase  contribution  for 
z  =  e0*^.  We  are  tempted  to  write 

log  [zr]  =  log  [ear  e^wr]  =  or  +  j«r. 

If  we  recall,  however,  that  the  phase  angle  must  be  periodic  in  w  (since  it  is  the  imag¬ 
inary  part  of  a  z-transform),  we  see  that  arg  [e^<r+^r]  must  be  defined  as  in  Fig.  13. 
This  factor  then  adds  a  nonanalytic  component  to  the  imaginary  part  of  log  [X(ea+"W)]. 
Formally,  the  contribution  to  the  complex  cepstrum  of  this  type  of  term  is 

<rn  rn 

0(n)  =  \  r  log  [eff+JW]  eJwn  du. 

-ir 


Performing  the  indicated  integration  shows  that 


6(n)  =  r  e 


an  cos  w  n 
n 


=  a  r 


n  *  0 
n  =  0. 


(55a) 

(55b) 
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Fig.  13.  Phase  curve  attributable  to  a  factor  zr  when  z  =  eff+^w. 

This  sequence  is  stable  only  if  the  contour  of  integration  is  the  unit  circle  (vs  0).  In 
fact,  the  sequence,  strictly  speaking,  only  has  a  discrete  Fourier  transform 


e(ej“)=  2 


0{n)  e* 


since  log  z  has  no  Laurent  series  expansion  about  z  *  0.  This  situation  is  analogous 

2'1 

to  the  continuous -time  function  (1+t  )  ,  which  has  no  two-sided  Laplace  transform  but 
does  have  a  Fourier  transform. 

Usually,  we  prefer  to  remove  the  linear-phase  component  before  computing  the 
complex  cepstrum.  This  is  easily  done  once  the  phase  curve  is  computed,  and 
clearly  its  removal  simply  corresponds  to  a  shift  of  r  samples  in  the  input  sequence. 
This  value  of  r  can  be  saved  and  used  to  shift  the  output  of  D-1,  if  this  is  appro¬ 
priate.  The  parameter  r  is  very  much  like  the  sign  of  A,  in  that  if  X(z)  is  the 

product  of  Xj(z)  and  X2(z),  each  having  the  form  of  Eq,  52,  then  r  =  +  r2  and 

it  will  only  be  possible  to  determine  r  j  and  r2  from  consideration  of  the  source  of 
the  sequence  x. 

Thus,  the  complex  cepstrum  of  a  sequence  whose  z-transform  is  of  the  form  of  (52) 
is  normally  obtained  as  follows.  Choose  the  contour  C  to  be  the  unit  circle,  and 

find  the  contributions  that  are  due  to  all  of  the  factors  except  zr,  using  either  the  power 

series  expansion  or  the  integral  relations.  We  may  then  simply  add  to  this  the 
component  6(n)  given  by  Eq.  55.  For  example,  we  could  write  for  sequences  whose 
z -transforms  are  of  the  form  of  (52),  and  choosing  v  =  0 
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x(n) 


r  cos  nn 
n 


2njn 


1 


X'(z) 

X(z) 


n-1  . 
z  dz 


n  #  0 


=  log  A 


0. 


(56) 


(57) 


We  note  that  the  integral  in  (56)  may  be  evaluated  for  n  =  0  if  we  do  not  divide  by  n. 
In  fact,  it  is  easily  shown  from  the  principle  of  the  argument  that  the  value  of  r  can  be 
obtained  from 


=  2wj  §e 


X'(z) 


dz. 


C  X(z) 


2.  9  EXAMPLES  OF  COMPLEX  CEPSTRA 

We  shall  give  several  examples  of  the  analytical  determination  of  the  complex  cep* 
strum.  These  examples,  although  simple,  are  somewhat  typical  of  the  kinds  of 
sequences  that  will  be  encountered  in  practice. 

Example  1:  Minimum-phase  sequence 
Let  x(n)  have  the  form 


x(n)  =  0 


=  a 


n  <  0 
n  0, 


where  |a|  <1. 

The  z-transform  of  this  sequence  is 
00 


X(z) 


=  Y  anz-n  - - - - r  for  |  z  |  >  |  a  | . 

n=0  1  ~ 


The  complex  logarithm  of  X(z)  is 
X(z)  =  -log  (1-az"1). 

Since  la!  <1,  we  see  that  x  is  a  minimum-phase  sequence. 

/\ 

Let  us  choose  the  region  of  convergence  of  X(z)  to  include  the  unit  circle  so  that  this 
can  be  the  contour  of  integration  in  determining  the  complex  cepstrum.  In  this  case  we 
can  obtain  x(n)  by  three  different  methods. 


(a)  Integral  Formula 

Since  we  observe  that  arg  [X(e^w)J  is  continuous  everywhere  for  this  example,  x(n) 
is  determined  by  the  equation 
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,  .  r  -zX'  (z)  , 

where  -zX'(*)/X(z)  is  given  by 


-zX'<z)_  ag-l 
X(z)  1  -  az~ 


Thus  we  see  that 


This  integral  can  be  evaluated  by  using  the  residue  theor  em,  to  obtain 


% 

11 


a.  ,  a 
x(n)  =  — 


log  (1)  =  0  n  =  0 
0  n  <  0 


(b)  Power  Series  Expansion  for  X(z) 

/\ 

Recall  that  X(z)  is  given  by 

X(z)  =  -log  (1-az-1)  =  ^  IT*""- 


where  the  power  series  expansion  for  -log  (1-az""1)  is  valid  in  the  region  |z|  >  |a|.  B ' 

A 

definition,  X(z)  is  also  given  by 


w 

=  2 


*(n)  z“n. 


By  comparison  of  the  two  power  series,  we  see  that 


x(n)  =  0 


n  «  0 


n  >  0. 


(c)  Recursive  Formula 

Let  us  compute  several  values  of  x(n),  using  the  recursive  formula.  Since  x(0)  *  1, 
x(0)  =  log  x,0)  =  0. 


37 


i 

i 


By  applying  the  recursion  formula,  we  obtain 

/s  *U> 

x(l)  = - =  a 

x(0) 

x{2)  .  o  i  j 

x(2)  = - 4^(1)  x<l)  =  a2  -|a2  =4 

x{0)  2  2  2 

x<3>  =  —  -  i-  (x(l)j c(2)+2x<2)x<l))  =  a3  -4  (a3+a3)  = 
x(0)  3  3 

We  note  that  because  the  sequence  is  minimum-phase,  the  complex  cepstrum  is  zero 
for  n  <  0,  and  that  our  result  agrees  with  that  obtained  by  using  the  Hilbert  trans¬ 
form. 

Example  2:  Nonminimum-phase  finite  length  sequence 
Let  x(n)  be  given  by 

x(n)  =  0  for  n  <  0  and  n  >  M 

=  bn  0  «  n  <  M. 

The  z -transform  is  given  by 


■  2  bVn  =  iT 


.  M  -M 
-  b  z 


Let  us  assume  that  |b|  >1.  Then  X(z)  has  M  -  1  zeros  located  as  shown  in  Fig.  14. 


Fig.  14.  Zeros  of  X(z)  =  —  ~-P  - - 

1  -  bz"1 


Since  the  sequence  is  finite -length,  the  region  of  convergence  of  X'z)  is  the  entire 
z  plane  except  for  z  =  0.  If  we  write  X(z)  in  the  form  of  Eq.  52,  we  obtain 


The  complex  logarithm  of  this  expression  is 


x(z)  =  log  bM-1  +  log  z"(M_1)  +  log  (l-b“MzM)  -  log  (l-b_1z). 

If  the  region  of  convergence  of  X(z)  is  chosen  to  be  the  region  fz|  <  |b|,  then 

S'.)  =  log  bM->  *  log  ,-<*-»  ♦  l  £  z"  -  f  zM”. 

n=l  n=l 

If  we  introduce  the  symbol  6(n) 

6(n)  =1  n  =  0 

=  0  n  *  0, 

we  can  write  the  inverse  transform  on  the  unit  circle  as 


z\  M_i  tr  (1-M)  cos  irk 

x(n)  =  (log  bM  1 )  6(n)  +  £  - k - 

k=-oo 


cos  irk  r-  .  Mk 

^ - 6(n-k)  +  2  \“6(n-kM) 


k=-oo 


V  hk 

-  Z  IT 6(n-k)- 


k=-oo 


Therefore  we  observe  that  we  obtain  a  stable  sequence  that  is  nonzero  for  both  positive 
and  negative  n.  We  can  also  see  that  if  the  sequence  were  shifted  to  the  left  by  M  - 1 
samples,  we  would  remove  the  term 


(1-M)  cos  k 
- ^ - 6(n-k) 


k=-°o 


from  the  complex  cepstrum.  That  is,  the  sequence  whose  values  are 
s(n)  =  x(n+M-l) 
will  have  the  z -transform 


S(z)  =  b 


.-M  M. 
M-l  (1~b  2  > 

(l-b_17.) 


Thus  we  can  easily  see  that  the  complex  cepstrum  s(n)  will  be  zero  n  >  0.  This  is  so 


because  the  shifted  sequence  is  maximum-phase.  The  shifting  of  sequences  to  remove 
linear  phase  components  is  quite  important  and  will  be  discussed  in  detail  in  Section  III. 

Example  3:  Repeated  pulses 

Suppose  that  the  sequence  x  has  values 

x(n)  =  s(n)  +  s(n-nQ)  +  . . .  +  s(n-{P-l)no), 

where  s  is  an  aperiodic  sequence  like  those  of  Examples  1  and  2.  The  sequence  x  can 
also  be  expressed  as 

x  =  s  ®  u, 

where  u  is  a  sequence  whose  values  are  given  by 
P-1 

u(n)  =  ^  6(n-knQ). 

k^O 


The  z-transform  X(z)  will  have  the  form 


X(z)  =  S(z)  •  U(z), 


where  U(z)  is 

P-1 

•  l 

k=0 


-n  k 
o 


The  region  of  convergence  of  U(z)  is  the  entire  z- plane  except  for  z  =  0.  We  note  that 
U(z)  has  zeros  at  equal  angular  spacing  around  the  unit  circle. 

The  complex  logarithm  of  X(z)  is 

X(z)  =  log  [S(z)]  +  log  [U(z)]  =  S(z)  +  U(z). 

A 

Let  us  now  choose  a  region  of  convergence  for  X(z).  In  this  case  it  is  not  possible  to 
choose  the  region  of  convergence  to  contain  the  unit  circle,  since  U(z)  has  all  its  zeros 
there.  This  means  that  if  S(z)  is  non  minimum-phase,  it  will  not  be  possible  to  obtain 
a  stable  complex  cepstrum.  We  can  remove  this  difficulty  by  using  exponential  weighting 
of  the  sequence  x.  We  know  that  if 

Xj{n)  -  «nx(n), 

then 

Xj(z)  ■  X(a_Iz)  =  S(«_1z)  •  Ufa^z)  =  Sj(z)  .  Uj{z), 
where  Sj(z)  is  the  z-transform  of  the  sequence  whose  values  are 
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Sj(n)  =  a  s(n), 

and  Uj(z)  is  the  z  -transform  of  the  sequence 

„  Pv  »„■> 

Uj(n)  =  anu(n)  =  )  a  °  6(n-kno>. 


Thus  Uj(z)  is 


Uj(z)  = 


(,  Pno-P"o'| 
-  U-«  z  ; 

/  n  -n  \ 


Now,  we  see  that  the  complex  logarithm  of  Xj  (z)  is 


a  a  (  Pn  -Pn_\  /  n  -n  \ 

Xj(z)  =  log  [S1(z>]  +  log  [Ujfz)]  =  Sj(z)  +  log  \l-a  °z  °)  -  log  Vl-«  °z  °). 

Since  the  zeros  of  U^(z)  now  lie  on  a  circle  with  radius  |a|,  we  see  that  the  sequence  Uj 
can  be  made  minimum-phase  by  making  |a|  <  1.  In  this  case  it  would  be  possible  to  use 
the  unit  circle  as  the  contour  of  integration  for  obtaining  Xj(n),  which  would  be 


oo  n  k 


oo  Pnk 


Xj(n)  =  s'j  (n)  +  ^  6(n-knQ)  -  ^  — 6(n-PnQk). 


Suppose  that  g  is  the  sequence  of  Example  2,  after  shifting  it  left  by  M  -  1  samples, 
that  is, 


s{n)  =  b 


n+M-1 


M  -  1  <  n  0 


a  0  elsewhere. 

Then  the  sequence  x  would  appear  as  in  Fig.  15a,  where  P  =  4  and  M  >  nQ.  The  samples 
of  the  discrete -time  function  have  been  connected  by  a  smooth  curve  for  convenience  in 
plotting.  The  weighted  sequence  Xj  is  shown  in  Fig.  15b,  and  the  resulting  complex  cep- 
strum  is  shown  in  Fig.  15c. 

This  simple  example  points  out  several  things  that  are  true  in  general: 

1.  The  input  x^  was  the  convolution  of  a  minimum-phase  sequence  with  a  maximum- 
phase  sequence.  The  resulting  complex  cepstrum  consists  of  a  part  that  is  zero  for 
n  <  0  because  of  the  minimum-phase  part  of  the  input  and  a  part  that  is  zero  for  n  >  0 
because  of  the  maximum  phase  part  of  the  input.  This  is  always  true  because  any 
sequence  can  be  expressed  as  the  convolution  of  a  minimum-phase  sequence  and  a 
maximum-phase  sequence  (with  possibly  some  time  shift).  In  some  cases  either  the 
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Fig.  15.  (a)  Input  sequence  x.  (b)  Exponentially  weighted  sequence  Xj(n)  =  anx(n). 

(c)  Complex  cepstrum  of  Xj. 

minimum-phase  part  or  the  maximum-phase  part  may  be  all  that  is  of  interest,  while 
in  general  this  may  not  be  true.  When  it  is  true,  considerable  simplification  results 
if  we  take  advantage  of  the  properties  of  minimum-phase  sequences. 

2.  The  input  sequence  Uj  had  samples  spaced  at  intervals  of  nQ,  rather  than  1.  The 
resulting  complex  cepstrum  has  its  samples  at  the  same  spacing  nQ.  This  can  be  shown 
to  be  true  in  general  if  the  spacing  of  the  samples  is  uniform.  It  is  also  true  approxi¬ 
mately  if  the  samples  are  unequally  spaced,  but  it  is  difficult  to  obtain  precise  results 
on  this  problem.  Examples  of  this  are  given  in  Section  V. 

3  The  input  sequence  Sj  was  "pulselike";  that  is,  it  had  most  of  its  samples  con¬ 
centrated  in  a  small  region  relative  to  the  total  duration  of  Uj .  The  samples  of  Sj  were 
spaced  at  umt  intervals.  The  complex  cepstrum  of  Sj  is  seen  to  approach  0  i>u  (ab)~r,/n, 
so  that  it  dies  out  at  a  fairly  rapid  rate.  We  have  seen  that  it  is  true  in  general  that  the 
complex  cepstmm  dies  out  at  least  as  rapidly  as  1 1  /n  {  for  all  n. 

2.  10  LINEAR  SYSTEM  IN  THE  CANONIC  REPRESENTATION 

We  have  discussed  in  detail  the  anr:.;  .s  and  realization  of  the  characteristic  sys- 

—  f 

tem  D  (and  therefore  its  inverse  D  *).  We  wish  now  to  discuss  the  general  type  of  lin¬ 
ear  system  that  has  been  found  useful  in  separating  convolved  signals.  We  shall  leave 
specific  examples  to  be  covered  in  the  sections  on  applications. 
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As  a  very  general  comment,  we  can  say  that  the  type  of  signals  for  which  homomor¬ 
phic  deconvolution  has  thus  far  been  found  useful  are  those  that  are  convolutions  of  sig¬ 
nals  whose  complex  cepstra,  in  some  sense,  do  not  overlap.  An  obvious  example  of  this 
is  when  one  signal  is  minimum-phase  and  the  other  is  maximum -phase.  This  was  the 
case  in  Example  3.  The  second  basic  situation  is  also  indicated  by  Example  3.  We 
observe  that  the  complex  cepstrum  of  the  "impulse  train"  Uj  has  isolated  samples 
occurring  with  spacing  nQ.  Suppose  that  Sj  is  a  sequence  whose  complex  cepstrum  dies 
out  rapidly  so  that  Sj(n)  is  small  for  n  «  nQ  (Sj  may  not  in  general  be  maximum-phase 
as  in  the  example).  If  we  have  as  the  input 

Xj  =  Sj  ®  Uj, 


then  we  see  that  the  complex  cepstra  s}  and  Uj  will  be  in  a  sense  separated  in  "time"  in 
the  complex  cepstrum  of  Xj. 

Both  of  these  situations  suggest  that  the  kind  of  linear  system  that  we  should  use  is 
of  the  form 

y(n)  =  £(n)  x(n).  (58) 

Such  systems  will  be  called  frequency-invariant,  in  analogy  with  time -invariant  linear 
systems  in  which  we  multiply  z -transforms  and  convolve  time  functions.  For  frequency 
invariant  linear  systems  we  multiply  time  functions  as  in  Eq.  58  and,  therefore,  we  con¬ 
volve  frequency  functions  as  in 

Y(ejw)  =  P  X(eje)  L(ej(w-*))  d|,  (59) 

-IT 


where  L(z)  is  the  z -transform  of  the  sequence  whose  values  are  £(n). 

As  a  general  comment,  let  us  consider  an  interesting  possibility  that  results  if  the 
input  is  of  the  form 

x(n)  =  ax  |  (n)  +  bx2(n). 


The  transform  of  this  equation  is 
X(eJU-)  =  aX  j  (e^u)  +  bX2(ejw). 


(See  the  Appendix  for  a  discussion  of  scalar  multiplication.)  Suppose  that  we  filter  the 

real  and  imaginary  parts  of  X(e^w)  with  different  linear  systems  L  (e^w)  and  L.(e^w), 
/\  r  i 

respectively.  Thus  Y(eJ  )  would  be  of  the  form 


Y(e' 


,M  = 


_1_ 

2tt 


r 


Xr(ej5) 


Lr(eJ(w-^5 


'i  + 


JL 

2ir 


r 


jX^e^)  Li(ej(“^)) 


d£. 
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(a) 


Fig.  16.  (a)  Frequency-invariant  system  that  is  linear  for  real  scalars, 
(b)  General  linear  frequency-invariant  linear  system. 


It  can  be  shown  that  if  and  only  if  a  and  b  are  real  numbers, 

Y(eJw)  =  aYjteh  +  bY^). 

Thus  for  real  scalars  we  may  filter  the  magnitude  and  phase  with  separate  filters. 
Therefore  we  see  that  in  general  the  linear  system  can  be  of  the  form 

y(n)  =  lr(n)  Ev  [x(n)  ]  +  £.(n)  Odd  [x(n)J,  (60) 


where 


x(n)  =  Ev  [x(n)]  +  Odd  [x(n)]p 

and  j?r(n)  and  £^(n)  are  the  inverse  transforms  of  Lr(e^w)  and  L.(e’  ’),  respectively.  The 
operations  suggested  by  Eq.  60  are  summarized  in  Fig.  16.  Figure  16a  illustrates  the 
general  case  and  Fig.  16b,  the  case 

£r(n)  =  £i(n)  =  £(n). 
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III.  COMPUTATIONAL  CONSIDERATIONS  IN 
HOMOMORPHIC  DECONVOLUTION 


We  have  given  a  detailed  analysis  of  the  canonic  form  for  homomorphic  deconvolu¬ 
tion.  This  analysis  was  carried  out  for  discrete  convolutions  and  it  leaned  heavily  on 
the  z-transform  both  in  the  analysis  and  realization  of  the  system.  When  we  turn  to  the 
actual  computational  realization  (in  the  form  of  digital-computer  programs),  we  find 
that  we  must  somewhat  modify  the  results  of  Section  II.  The  main  reason  for  this  is 
that  the  z-transform  (usually  its  unit  circle  evaluation)  is  a  function  of  a  continuous 
complex  variable  z  (or  <*>).  Since  digital  computers  deal  with  finite  collections  of  num¬ 
bers  rather  than  functions,  it  is  clear  that  we  must  be  content  with  only  a  finite  number 
of  values  of  the  z-transform.  Thus  we  are  led  to  the  study  of  the  sampled  z-transform 
and  its  properties. 

The  other  major  consideration  is  the  calculation  of  the  phase  of  the  sampled 
z-transform.  We  shall  apply  the  results  of  Section  II  to  show  what  properties  the  sam¬ 
pled  phase  function  must  have  and  then  we  shall  show  an  algorithm  with  which  we  can 
compute  the  proper  phase. 

Therefore  our  present  major  purpose  is  to  show  how  the  ideas  of  Section  II  can  be 
translated  into  programs  for  a  digital  computer.  These  programs  will  constitute  our 
realization  of  the  canonic  system  of  Fig.  4. 

3.  1  SAMPLED  z-TRANSFORM 

As  we  can  see,  the  use  of  the  z-transform  is  very  convenient  in  the  analysis  of 
homomorphic  deconvolution.  If  we  wish,  however,  to  use  a  computer  to  evaluate  the 
relations  that  we  obtained  in  Section  II,  we  must  deal  with  only  a  finite  number  of  values 
of  the  z-transform.  For  the  same  reason,  we  shall  be  limited  at  the  outset  to  finite- 
length  sequences. 

Let  us  consider  a  sequence  x  whose  z-transform  has  a  region  cf  convergence 
including  the  unit  circle.  (This  will  always  be  true  for  finite  length  sequences.)  We  can 
therefore  evaluate  X(z)  for  z  =  so  that  we  obtain 

00 

X(e^)  =  ^  x(n)  e~jwn.  (61) 

n=-on 


Henceforth,  (61),  which  is  a  function  of  the  continuous  variable  u>,  will  be  referred  to 
as  the  Fourier  transform  (FT)  of  the  sequence  x.  The  inverse  Fourier  transform 
(IFT)  of  X(ejw)  is 

x(n)  =  -  X(ejw)  e^wn  dw  =  ^  jj*  X(ejw)  e?™  du>.  (62) 

We  note  that  the  limits  of  integration  in  (62)  can  be  any  convenient  interval  of  length  2w, 
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since  both  X(e^w)  and  e^wn  are  periodic  with  period  2tt.  For  computation  on  a  digital 
computer,  we  must  be  content  with  only  a  finite  collection  of  samples  of  £q,  61.  This 
leads  us  to  consider  a  sampling  theorem  for  the  transform  X(e^u).  Let  us  suppose  that 
we  have  samples  at  exactly  N  values  of  w,  that  is,  at  N  points  around  the  unit  circle 
spaced  at  equal  angular  increments  of  2it/N.  Thus  we  obtain  the  sequence  of  samples 


n=-oo 


-jjfkn 


x(n)  e 


for  k  =  0, 1 . N-l. 


(63) 


(We  choose  to  use  positive  values  of  k  for  computational  convenience.)  Corresponding 
to  (62),  we  have 


To  see  how  x(n)  is  related  to  x(n),  let  us  substitute  (63)  in  (64).  The  result  is 

M 


If  we  interchange  the  order  of  summation,  we  obtain 

N-l 


00 


x(n)=  ^  x(m)jji-^r  exp^j  ~  k(n-m)^ 

m=-oo  \  k=0 


(65) 


Let  us  define 


N-l 


d(n-m)=~Y  exp(j^Jk(n-m)). 


U 

k=C 


(66) 


It  can  be  shown  that  the  sequence  d  is  periodic  with  period  N  and  that  it  can  be 
represented  as 


00 

d(n-m)  -  6(n-m+rN), 

r=-oo 


(67) 


where 


Co 


6(n)  =  < 


1 


n  *  0 

n  =  0. 


46 


Therefore  (65)  becomes 


wu 

i'->-  I 


x(m)  d(n-m) 


wu 

-I 


x(n+rN). 


m=-oo 


r=-oo 


(68) 


Equation  68  shows  that  jc(n)  is  periodic  with  period  N,  and  that  it  is  possible  for  x(n)  to 
be  equal  to  x(n)  for  certain  values  of  n  if  the  original  sequence  is  of  finite  length.  For 
example,  let  us  assume  that  the  sequence  x  has  values  x(n)  such  that  x(n)  =  0  for  n  <  0 
and  n  >  M.  It  is  clear  from  (68)  that  if  N  >M,  then 


x(n)  =  x(n)  for  0  «  n  <  N. 


On  the  other  hand,  if  N  <  M,  we  encounter  aliasing  in  attempting  to  return  to  the 
sequence  from  the  samples  of  X(e^w).  This  is  shown  in  the  simple  example  of  Fig.  17. 
Figure  17a  shows  the  original  sequence  x,  wherein  we  can  see  that  the  length  of  the 
sequence  is  5  samples  (M=5).  Using  (68),  we  have  plotted  the  sequence  x  for  two  dif¬ 
ferent  values  of  N.  In  Fig.  1 7b  we  show  the  sequence  x  when  we  have  sampled  X(fcJw) 


(a) 


Fig.  17.  (a)  Finite- length  sequence  of  5  samples,  (b)  Periodic  sequence  cor¬ 
responding  to  sampling  the  z-transform  at  5  points  on  the  unit  circle, 
(c)  Periodic  sequence  corresponding  to  sampling  the  z-transform  at 
4  points  on  the  unit  circle.  (Open  circles  indicate  that  two  values 
overlap  at  n  -  0,  ±4,  ±8 .  This  effect  is  called  aliasing.) 
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five  times  (N=5).  In  the  interval  0  «  n  <  5,  x(n)  =  x(n),  and  the  sequence  x  is  periodic 
with  N  =  5.  In  Fig.  17c  x  is  shown  for  the  case  in  which  we  have  sampled  X(e^w)  only 
four  times.  We  note  that  x  is  periodic  with  period  N  =  4;  however,  two  points  of  the 
original  sequence  occur  at  the  same  value  of  n  in  x  so  that  x(0)  *  x(n).  This  overlap 
in  the  periodic  sequence  x  is  known  as  aliasing.  Clearly,  the  way  io  avoid  aliasing  is 
to  insure  that  N  >  M;  that  is,  we  must  sample  X(e^w)  at  a  high  enough  rate. 

If  we  assume  that  N  >  M,  we  can  write  the  following  pair  of  relationships  for  finite 
length  sequences: 


N-l 


X(k) 


I 


.  2w . 

-J-ffkn 


k  =  0,  1 . N-l, 


n=0 


(69) 


N-l  .  2it. 

i  r  j  v 

x(n)  e  n=  0,1 . N-l. 

k=0 


(70) 


We  note  that  both  x  and  X  are  periodic  with  period  N.  That  is, 
x(n)  »  x(n+rN)  for  r  =  0,  ±1,  ±2, . . . 

W  M 

X(k)  «  X(k+rN)  for  r  =  0,  ±1 ,  ±2,. . .  . 

Anot  tie "  way  of  expressing  this  fact  is  to  say  that  in  Eq.  69  and  Eq.  70,  all  integers  n, 
k,  and  kn  are  to  be  interpreted  modulo  N.  Equations  69  and  70  have  been  referred  to 
as  a  Discrete  Fourier  Transform  (DFT)  pair.1** 

Let  us  now  consider  how  (69)  and  (70)  can  be  used  in  the  realization  of  the  charac¬ 
teristic  system  D.  We  shall  replace  all  z-transforms  by  Eq.  69  (the  DFT)  and  all 
inverse  z-transforms  by  the  inverse  discrete  Fourier  transform  (IDFT)  of  Eq.  70.  Since 
our  interest  is  in  convolutions  of  sequences,  we  must  consider  the  effect  of  sampling 
the  z-transform  when  the  input  is  a  convolution.  Let  us  assume  that  Xj  and  x 2  are 
finite-length  sequences  such  that 

Xj(n)  =  0  for  n  <  0  and  n  >  M^, 

x2(n)  =  0  for  n  <  0  and  n  > 

The  convolution  of  Xj  and  x2  has  values  given  by 

n  n 

x(n)=  )  Xj(n-r)  x2(r)  =  )  Xj(r)  x2(n-r).  (71) 

r=0  r=Q 


We  can  see  that  the  sequence  x  is  also  finite  in  length  and  that 
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x(n)  =  0 


x(n)  =  0  for  n  <  0  and  n  >  Mj  + 

The  FT  of  the  sequence  x  is  X(e^w)  and  is  given  by 
X(«>)  =  X^e*")  •  X2(ejw). 

Let  us  assume  that  we  sample  X(e^w)  at  N  points  to  obtain  the  samples 

X\e  W  )  sX^e  N  /•  X2\e  N  j,  k  =  0, 1 . N-l.  (72 

We  have  seen  that  if  N  is  not  large  enough,  aliasing  will  occur.  To  see  the  nature  of 
this  aliasing  for  convolutions,  let  us  note  that  (72)  can  be  written  in  terms  of  the  DFT. 

X(k)  =  X,  (k)  •  X,(k)  k  =  0,  1 . N-l.  (73 

where  k  is  taken  modulo  N.  The  IDFT  of  (73)  can  be  shown  to  be 
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Fig.  18.  (a)  and  (b)  Aperiodic  finite-length  sequences,  (c)  Discrete 
convolution  of  Xj  and  x2.  (d)  Periodic  sequences  for  circu¬ 
lar  convolution  when  N  <  Mj  +  M,,.  (e)  Periodic  sequences 
for  circular  convolutie -•  when  N  >  Mj  +  M?. 


(74) 


N-l 


N-l 


x(n)  =  ^  x^n-r)  x2(r)  =  ^  x^r)  x2(n-r), 


r=0 


r=0 


where  r  and  n-r  are  both  to  be  taken  modulo  N,  and  n  =  0,  1,  ....  N-l.  That  is,  x  is 
the  result  of  a  circular  or  periodic  convolution  as  opposed  to  the  original  sequence  x 
which  was  the  result  of  an  aperiodic  convolution.  This  point  is  illustrated  by  Fig.  18. 

In  Fig.  18a  and  18b  we  show  two  finite  length  sequences  of  lengths  and  M2<  respec¬ 
tively.  Note  that  Mj  <  M2-  Figure  18c  shows  Xj(n-r)  and  x2(r)  plotted  as  a  function 
of  r,  as  is  required  for  convolution.  The  value  of  x(n)  is  obtained  by  adding  the  prod¬ 
ucts  Xj(n~r)  x2(r).  From  this  figure,  we  can  clearly  see  that  x(n)  will  be  zero  for 
n  <  0  and  n  >  Mj  4  M 2>  since  the  sequences  do  not  overlap  for  these  values.  Figure  18d 
shows  Xj(n-r)  and  x2(r)  as  would  be  obtained  for  N  =  M2  +  1.  Although  neither  of  the 
sequences  Xj  or  x2  differ  from  Xj  and  x2,  respectively,  in  the  interval  0  «  n  <  N,  it 
is  clear  that  the  circular  convolution  will  not  be  equal  to  the  aperiodic  convolution  for 
this  value  of  N.  Figure  18e  shows  x.(n-r)  and  x„(r)  for  N  =  M.  4-  M„  4-  1.  In  this  case, 

~  d.  1  C. 

it  is  clear  that  the  periodic  convolution  x  will  equal  the  aperiodic  convolution  x  in  the 
interval  0  «  n  <  N.  Thus,  if  we  sample  the  z-transform  at  a  high  enough  rate,  the  DFT 
X(k)  can  represent  an  aperiodic  convolution  with  no  aliasing. 

Let  us  now  consider  how  we  can  compute  the  complex  cepstrum  using  the  DFT.  We 
recall  that  we  have  defined 

W")  *  log  [X(eh]. 

A 

If  we  sample  X(»J  )  at  N  equally  spaced  values  of  u>,  we  obtain 


A 

X 


log 


X\e 


j2lkM 

J  N  K 


k=  0.  1 . N-l. 

The  samples  of  X(e^w),  of  course,  can  be  directly  evaluated  by  using  the  DFT  relation- 
ship  of  (69)-  Therefore,  we  can  write 

A 


X(k)  =  log  [X(k)j 


k  =  0,  1 . N-l, 


(75) 


If  we  apply  the  IDFT,  we  obtain 

N-l  ,2ir 

$(«>  =  £  ^  lo*  lx<k>3  e  N 

k=0 

In  general,  x  is  not  a  finite-length  sequence  so  that  we  should  expect  some  aliasing 
to  occur  in  x.  Equation  68  shows,  in  fact,  that  we  must  write  x(n)  as 


n-0,  1, _ N-l. 


(76) 


so 

A  V  A 

x(n)  =  >  x(nt-rN). 

pa-00 


(77) 
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where  $(n)  is  periodic  with  period  N.  The  aliasing  of  the  complex  cepstrum  may  be  a 
problem  in  some  cases  and  in  others  it  may  not.  We  recall  that  in  general 


J  x(n)|  <  A  -j-  for  all  n  #  0, 


so  that  it  is  possible,  by  making  N  large,  to  eliminate  most  of  the  aliasing  of  the  com¬ 
plex  cepstrum.  By  this  we  mean  that  if  x(n)  =  0  for  n  <  0  and  n  >  M,  then  by  choosing 
N  »  M  and  defining 


x(n)  =  x(n) 


0  «?  n  <  M 


M  «=  n  <  N, 


there  will  not  be  as  much  overlap  between  periods  of  $  as  if  we  had  chosen  N  =  M  +  1 , 
as  it  is  clearly  possible  to  dc.  Therefore  the  aliasing  of  $  is  minimized  by  choosing 
N  as  large  as  possible  consistent  with  the  computer  storage  and  computation  time  con¬ 
straints  and  then  padding  the  M  +  1  samples  of  x  with  zeros. 

The  numerical  operations  of  computing  the  complex  cepstrum  are  summarized  in 
the  following  equations. 


x(n)  e 


.  Zn , 

“J  to 


k  =  0,1,...,  N-l, 


X(k)  =  log  [X(k)J  k  =  0,  1 . N-l, 


N-l  .  2tt. 

a.  .  I  V  .  J  N kn 

2(n)  =  n  Z  ~(k)  6 

k=0 


n  =  0.  1 . N-l, 


(78b) 


(78c) 


We  recognize  that  the  complex  cepstrum  that  we  compute  by  using  these  equations  will 
differ  because  of  aliasing  from  the  theoretical  complex  cepstrum  obtained  through  the 
z-transform. 

3.2  FAST  FOURIER  TRANSFORM 

In  1965,  Cooley  and  Tukey14  disclosed  an  algorithm  for  high-speed  calculation  of 
the  DFT.  Since  that  time,  there  has  been  tremendous  interest  in  the  application  of  this 
algorithm  in  many  diverse  areas.  One  of  the  fieldr  in  which  it  has  been  used  with  great 
success  is  that  of  digital  filtering  or  waveform  processing.^’ 17  In  fact,  the  availabil¬ 
ity  of  this  algorithm  allows  us  to  consider  realizing  this  scheme  for  deconvolution. 

To  see  the  nature  of  this  new  method  of  evaluation  of  Fourier  transforms,  let  us 
recall  the  DFT  pair 
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k  =  0,1 . N-l, 


(79a) 


1 


N-l 


X(k)  =  2 


x(n)  e 


.  2ir, 

-J  N5"1 


n=0 


N-l 

x(„)  -  i  l  X(k)  e 
k=0 


.  2ir. 

3  N-kn 


n  =  0, 1, . . . ,  N-l. 


(79b) 


The  fast  Fourier  transform  (FFT)is  just  Eqs.  79a  or  79b  done  in  a  high-speed  way.  We 
note  that  to  evaluate  Eqs.  79  in  a  straightforward  manner  involves  N  complex  multi¬ 
plications  and  additions.  Cooley  and  Tukey  showed  that  if  N  =  2m,  it  is  possible  to 
evaluate  either  (79a)  or  (79b)  by  using  m  iterations  of  a  process  involving  N  complex 

multiplications  and  additions.  Therefore  the  total  number  of  operations  is  N  log  N, 

2  2 
rather  than  N  .  Clearly,  if  N  is  only  moderately  large,  the  amount  of  computation  (and 

thus  computing  time)  is  considerably  reduced  by  uaing  the  FFT  algorithm. 

14-17  "  ' 

Since  the  publications  on  the  subject  of  the  FFT  have  mushroomed  in  the 
several  years  since  the  appearance  of  the  original  Cooley-Takey  paper,  ve  shall  not 
discuss  the  algorithm  and  its  programming.  We  shall,  however,  be  interested  in  the 
properties  of  the  FFT  (or  DFT),  many  of  which  are  slightly  different  from  corre¬ 
sponding  ones  for  the  z-transform  or  Fourier  transform  of  a  sequence.  This  difference 
is  usually  a  result  of  the  periodicity  of  both  X(k)  and  x(n).  We  have  already  seen  an 
example  of  this  in  the  case  of  convolution.  A  table  of  the  useful  relations  and  symme¬ 
tries  is  given  by  Gentleman  and  Sande. 1 5 


3.  3  PROPERTIES  OF  THE  SAMPLED-PHASE  CURVES 

We  shall  consider  the  problem  it  computing  the  samples  of  arg  [X(e^u’)J  from  the 
samples  of  X(e^w).  We  have  shown  that  arg  [X(e^w)]  must  be  continuous  for  -«  <  w  <  it 
and  that  it  must  be  odd  and  periodic  with  period  2*.  We  recall  that  for  rational 
z-transforms,  arg  [X(e^w)]  is  in  general  discontinuous  at  u=  ±ir,  ±3,  ....  because  of 
the  linear  phase  component.  These  conditions  imply  a  similar  set  of  conditions  on  the 

samples  of  arg  [X(er*u')J,  that  is,  on  arg  [X(k)j,  for  k  =  0,  1,2 . N-l.  These  condi- 

tions  are  given  below  under  the  assumption  that  we  have  sampled  X(e*w)  at  a  suffi¬ 
ciently  high  rate  so  that  the  complex  cepstrum  will  not  be  severely  aliased.  These 
conditions  are  as  follows. 

Cl.  The  inequality 

!  arg(X(k)j-  arg{X(k+l  )JJ  <c 

must  hold  for  0  k  <  N/2  -  1  and  for  N/2  +  1  «  k  <  N-l,  wtrre  £  is  a  toler¬ 
ance  depending  on  N  (that  is,  the  rate  of  sarnpln.g  of  X(c’  ')). 
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C2.  arg  [X(k)J  is  an  odd  function  of  k;  that  is, 
arg  fX(k)]=  arg  [X(N-k>] 

for  k  =  0,  1 . N-l,  with  k  and  N-k  taken  modulo  N. 

C3.  arg  [X(k)]  is  periodic  with  period  N;  that  is, 

arg  fX(k)]  =  arg  fX(k+rN)]  r  =  0,*1,±2 . 

where  k  =  0,  1 . N-l,  and  k  and  k+rN  are  taken  modulo  N. 

To  see  what  these  conditions  mean,  let  us  consider  the  type  of  phase  curves  to  be 
expected  for  finite-length  sequences.  Consider  a  sequence  x  whose  values  x(n)  satisfy 

x(n)  =  0  for  n  <  0  and  n  >  M. 

The  corresponding  z  -transform  is 


X(,) .  I 
n=0 


M  _  m, 

-m 

x(n)  z'n  =  Az  ° 


m 

o 

n  (i-b  z). 

r=l  r 


(80) 


where  the  ar  and  br  are  all  less  than  one  in  magnitude,  and  M  -  m .  +  ra  .  Equation  80 
places  in  evidence  the  fact  that  in  general  there  will  be  m.  ze*\js  inside  the  unit  circle 
and  mQ  zeros  outside  the  unit  circle.  Since  arg  [X(e*w)j  is  the-  sum  of  the  angles  of  each 
factor  in  Eq.  80,  consideration  of  Figs..  8,  9,  and  13  shows  that  arg  [X(eP“)J  has  the 
properties 

arg  fX(e^w)]  =  0 


arg  [X(eJ“)]  =  -mQit 
arg  [X(eJW>]  =  moir 


for  <*)  =  0,  2w,  4ir, .  . . 

for  u  =  w-y, 3n-y, . .  . , 
for  w  =  w+y, 3w+y, . . . , 


(81a) 

(81b) 

(81c) 


where  y  is  an  arbitrarily  small  positive  number.  In  obtaining  these  equations,  we  have 
assumed  that  A  is  positive. 

Let  us  now  consider  the  corresponding  sampled  Fourier  transform  and  its  sampled 

i  — 

k  J  N 

phase.  We  obtain  X(k)  from  (80)  simply  by  replacing  z  by  W  ,  where  W  =  e  .  There¬ 
fore  we  obtain 


m 


-km  I,  ,  v  o  . 

X(k)  =  AW  °  n  ( 1  -a  W  *  )  n  { 1  -b  WK  ) 
r=l  r=l 

for  k  =  0,  l . N-l.  We  can  determine  the  sign  of  A  by  looking  at 


(82) 


m. 


m 


i  o 

X(0)  =  A  fl  (1-aJ  n  (1-b  ), 


r=l 


r=  1 


since  the  sign  of  A  is  the  same  as  the  sign  of  X(0).  If  A  is  negative,  we  normally  change 
the  sign  of  X(k)  before  computing  arg  [X(k)j,  so  that  we  remove  any  constant  phase 
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f 


•3 


component  caused  by  A.  Henceforth  we  shall  assume  that  A  is  positive. 

The  properties  corresponding  to  Eqs.  81  are  easily  ^uii  to  be 

arg  [X(k)j  =0  for  k  =  0,-|t  N,~,  2 f? .  (83a) 

arg  [X(k)]  = -mQir  for  k  =  -y”  ^^jjT  ”  •  •  •  •  (83b) 

arg  [X(k)l  *  +moir  for  k  =  y+ 1,^+1 .  (83c) 

We  note  that  arg  [X(k)]  is  given  the  value  0  at  k  =  N/2;  3N/2,  ...  so  as  to  satisfy  the 
requirement,  of  odd  symmetry.  These  properties  are  exhibited  by  Fig.  19a  where  we 
have  shown  only  one  half  cycle  of  arg  [X(k)]. 

>>w 


CO*  00 

-2  w 

N/2  k 

•4t 

-♦-4 

>♦*« 

(C) 

Fig.  19. 

(a)  Samples  of  arg  [X(e^w)J.  (b)  Principal 

value  of  arg  [X(e^w)].  (c)  Correction 

sequence  for  obtaining  p.rg  from  ARG. 


These  properties  of  the  sampled  phase  curve  will  be  used  to  discuss  an  algorithm 
for  computing  arg  [X(k)j  from  X(k)  which  in  turn  is  obtained  by  using  the  FFT  algorithm. 

3.4  AN  ALGORITHM  FOR  COMPUTING  arg  [X(k)j 

The  properties  of  arg  [X(k)]  were  given  in  section  3.  3,  These  properties  must  be 
satisfied  by  any  phase  curve  that  we  compute  if  we  want  to  insure  that 
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£(n)  =  £j(n)  +  ^2(n), 

where  x  is  a  finite  length  sequence  of  the  form 
x=  Xj  ®  x2. 

A  typical  sampled  phase  curve  is  shown  <n  Fig.  19a.  In  Fig.  19b,  we  show  the  principal 
value  of  the  curve  in  Fig.  19a.  Let  us  recall  that  we  can  obtain  X(k)  from 

N-l 

X(k)  =  S  x(n)  W-kn  =  X Jk)  3  jX.(k),  (84) 

~  ~r 

n=0 

where  N  is  a  power  of  2  that  is  greater  than  tlie  length  of  the  sequence  x.  Thus  we  are 
given  X  (k)  and  X.(k),  and  we  must  compute  arg  [X(k)]  so  as  to  satisfy  the  properties 
given  in  section  3.3.  This  is  not  as  simple  as  it  may  appear  at  first  glance.  The  prin¬ 
cipal  value  may  be  easily  obtained  by  using  standard  routines  based  on  a  polynomial 
approximation.  Let  us  assume  that  we  have  computed 

-it  <  ARG  [X(k)]  «  ir  (85) 

for  k  =  0,  1,  ....  N-l.  Although  ARG  [X(kj]  does  net  satisfy  our  requirements,  it  can  be 
used  as  i  basis  for  computing  the  correct  phase  curve.  To  see  this,  consider  Fig.  19b. 

We  note  \hat  arg  [X(k)]  can  be  expressed  as  the  sum 

arg  [X(k)J  =  ARG  [X(k)]  +  COR  (k),  (86) 

where  COR  (k)  is  shown  in  Fig.  1 9c  for  that  example.  In  general,  it  is  clear  that  § 

COR  (k)  =  27rq,  ' 

ti 

where  q  is  a  positive  or  negative  integer  that  depends  on  k  so  that  the  properties  of  ' 

section  3.3  are  satisfied  for  all  k.  * 

Our  discussion  suggests  that  we  can  compute  arg  [X(k)]  from  ARG  [X(k)j  by  com¬ 
puting  the  correction  sequence  COR  (k)  and  then  adding  it  to  ARG  [X(k)j.  This  can  be 
done  if  the  sampling  rate  is  high  enough.  Then  ARG  [X(k)]  contains  all  information 
required  to  compute  COR  (k).  In  order  to  see  this,  it  is  convenient  to  make  several 
definitions.  We  say  that  ARG  [X(k)j  has  a  positive  jump  of  2ir  (PJ  of  2w)  at  k  if 

ARG  [X(k+1 )}  -  ARG  [X(k)j  >  2w  -  e  2 ,  (87a) 

where  is  a  positive  number  whose  value  depends  to  some  extent  on  the  rate  at  which 
we  sample  the  phase.  Similarly,  we  say  that  ARG  [X(k)]  has  a  negative  jump  of  2ir  at  k 
(NJ  of  2ir)  if 
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ARC  [X(k+1 )]  -  ARG  [X(k)j  <  -(2tr-C]). 


(87b) 


By  carefully  considering  the  example  of  Fig.  19.  we  see  that  COR  (k)  can  be  com¬ 
puted  by  using  the  following  algorithm: 


COR  (k+1 )  =  COR  (k)  -  2 it 

for  a  PJ  of  2 it  at  k 

(88a) 

COR  (k+1)  =  COR  (k)  +  2* 

for  a  NJ  of  2tt  at  k 

(88b) 

COR  (k+1)  =  COR  (k) 

otherwise, 

(88c) 

where  COR  (0)  =  0  and  k=  0,  1,  . ..,  N-l.  Therefore  we  see  that  we  can  compute 
arg  [X(k)J  in  the  following  steps:  Compute  ARG  [X(k)j  for  k  =  0,  1,  . . . ,  N-l,  using  a 
suitable  routine  based  on  an  inverse  tangent  approximation;  use  Eqs.  87  and  88  to  com¬ 
pute  COR  (k)  for  k  *  0,  1,  . . . ,  N-l;  and  add  COR  (k)  to  ARG  [X(k)]  for  k  =  0.  1.  . . . ,  N-l 
to  obtain  arg  [X(k)].  (We  note  that  this  step  can  be  done,  as  COR  (k)  i3  computed  so  that 
extra  storage  for  COR  (k)  is  not  required.} 

When  we  are  dealing  with  input  sequences  with  many  samples,  we  quite  often  find 
that  there  may  be  several  hundred  zeros  of  X(z)  outside  the  unit  circle.  Since 

[*(1  ’1)]*"moir' 

where  mQ  is  the  number  of  zero"  outside  the  unit  circle,  we  often  find  that  the  linear 
phase  component  is  so  large  that  it  dominates  the  rest  of  the  phase  components.  Let 
us  call  the  sampled  linear  phase  component  ©(k)  .  Therefore  we  find  that  in  order  to 
have  properties  of  section  3.4,  we  require 


0(k)=  --^mok 

N 

°^k<T 

(89a) 

=  0 

ii 

(89b) 

■  1 

|  <  k  <  N. 

(89c) 

It  can  be  shown  that  the  contribution  to  x(n),  which  is  due  to  j©(k),  is 


cos  im 


tan 


tm 

N 


n  =  1,2,...  ,N-1 


=  0 


n  =  0. 


(90) 

(90) 


These  sequences  are  shown  in  Fig.  20.  Just  as  ©(k)  dominates  the  phase  when  mQ  is 
large,  we  find  that  0(n)  dominates  the  complex  cepstrum  and  obscures  much  of  the 
interesting  information  in  $. 
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Fig.  20.  (a)  Linear  phase  component  attributable  to  a  factor  W  °. 
(b)  IDFT  of  (a). 


It  is  possible  to  remove  this  component  before  computing  the  complex  cepstrum. 
Clearly,  we  could  simply  subtract  the  sequence  in  Fig.  20a  from  arg  [X(k)]  before  com¬ 
puting  the  complex  cepstrum.  This  car.  be  done,  since  we  note  that 


[*(?-')] 


m  »  - 
o 


if  we  have  sampled  the  phase  curve  at  a  high  enough  rate.  Thus  we  can  compute  the 

right-  hand  side  of  (91)  and  round  off  the  result  to  the  nearest  integer  to  obtain  m  . 

o 

Removing  the  linear  phase  component  in  arg  [X(k>  is  equivalent  to  removing  the 
-km  km 

factor  W  °  in  (80)  by  multiplying  by  W  °.  This  in  turn  is  equivalent  to  rotating 
(since  all  integers  are  taken  modulo  N)  the  input  sequence  x(n)  to  the  left  by  m  posi- 
tions.  This  fact  may  be  used  at  the  output  of  the  inverse  characteristic  system  to 
reposition  the  output  sequence  when  this  is  necessary. 

As  an  example,  let  us  consider  the  phase  curves  of  Fig.  19.  We  see  that  we  can 
compute  Figs.  19b  and  19c  by  using  the  methods  discussed  previously.  If  we  add  these 
two,  we  obtain  Fig.  19a  which  contains  the  undesirable  linear  component.  On  the  otner 
hand,  if  we  first  note  that 


m  a  - 
o 


(92) 


ARG 


[x(f-l)]*c°fl( 


7T 

then  we  could  subtract  ®(k)  from  COR  (k)  before  adding  it  to  ARG  [X(k)|.  This  is 

shown  in  Fig.  21  wherein  we  have  repeated  Fig.  19b  as  Fig.  21a.  Figure  21a  shows  the 

resulting  correction  for  this  example,  and  Fig.  21  c  shows  the  sum  of  Fig.  21a  and  21b. 

The  result  is  the  phase  curve  of  the  rotated  sequence  whose  values  are  :t(n+m  ),  where 

n  +  m  is  taken  modulo  N. 
o 


Fig.  21. 

(a)  Typical  principal  value  curve,  (b)  Cor¬ 
rection  curve  for  obtaining  arg  [X(k)]  and 
at  the  same  time  removing  the  linear  phase 
component,  (c)  Result  of  adding  (a)  to  (b). 


Let  us  summarize  the  results  of  the  previous  discussion.  Our  procedure  in  com¬ 
puting  the  phase  is  as  follows. 

(Al)  Compute  the  principal  value  ARG  [X(k)]  from  the  DFT  X(k)  for  k  =  0,  1,  .  .  . , 
N-l . 

(A2)  Compute  the  step  correction  function  COR  (k)  for  k  =  0,  1 . N-l  using 

Eqs.  88. 

(A3)  Determine  how  far  to  the  left  the  sequence  should  be  rotated  by  using  Eq.  92. 

(A4)  Subtract  ®{k)  as  defined  in  Eqs.  89  from  COR  (k)  for  k  =  0,  1 . N-l. 

(A5)  Add  the  result  of  (A4)  to  ARG  [X(k)j  for  k  =  0,  1,  ....  N-l  to  obtain  the  phase 
for  the  rotated  sequence. 
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If  one  has  enough  computer  storage  for  an  extra  table  of  N  values,  we  recommend 
that  each  step  be  done  for  all  values  of  k  before  proceeding  to  the  next  step.  This  i3 
because  of  the  differences  in  the  sizes  of  the  values  of  the  three  sequences  that  we  are 
adding  or  subtracting.  The  maximum  value  of  ARG  [X(k)]  is  clearly  rr.  The  maximum 
magnitude  of  COR  <k)  and  of  ©(k)  is  however,  the  maximum  magnitude  of  the  dif¬ 

ference  of  the  two  is  generally  much  less  than  moTr. 

If  storage  is  not  available  for  an  extra  table,  we  can  accomplish  the  same  result  by 
essentially  doing  {A2)  twice.  We  can  see  from  Fig.  19  that  the  value  COR  (N-l)  is 
equal  to 

COR  =  2*(#NJ-#PJ), 

where  #PJ  is  the  number  of  positive  jumps,  and  #NJ  is  the  number  of  negative  jumps  in 
the  interval  0  «k  <  N/2.  Thus  we  can  compute  mQ  from  Eq.  92  and  form  COR  (k)  -  ©(k) 
as  it  is  added  to  ARG  [X(k)j  and  then  store  the  result  in  the  same  location. 

3.5  OTHER  COMPUTATIONAL  CONSIDERATIONS 

We  shall  discuss  some  simple  techniques  that  can  be  used  to  minimize  the  required 

computation  time  for  realizing  the  characteristic  system  and  its  inverse. 

15 

Gentleman  and  Sande  summarize  some  of  the  useful  properties  of  the  DFT.  Many 
authors  refer  to  the  symmetries  inherent  in  the  DFT  relationships.  For  example,  we 
find  that  if  x(n)  is  real,  then  the  real  and  imaginary  parts  of  the  DFT 

X{k)=Xr(k)+jX.(k), 


have  the  properties 

Xr(k)  =  Xr(N-k) 

193a) 

X.(k)  =  -X.(N-k) 

(93b) 

for  k  =  0,  1,  . . .  ,  N-l  and  k  and  N-k  taken  modulo  N.  Thus  we  say  that  the  real  part  of 
X(k)  is  an  even  function  of  k,  and  imaginary  part  of  X(k)  is  an  odd  function  of  k. 

Let  us  consider  a  sequence  whose  values  are  purely  imaginary  such  as 

x(n)  =  jq(n)  for  n  =  0,  1 . N-i, 

t+t  **•» 

where  the  q(n)  are  real  numbers.  The  resulting  DFT  would  be 
X(k)  =  jQ(k)  =  -Q  (k)  t  jQ  Ik), 

m  w  A  *%*  i 

where 

Q(k)  =  Q  <k)  +  jQ.(k) 


7 

a 


i 

% 


it 


t; 


J 


s* 
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is  the  DFT  of  the  real  sequence  q  whose  values  are  q(n).  Therefore,  if 

x(n)  +  p(n)  +  jq(n), 

«•* 

where  p  and  q  are  both  finite  length  sequences,  then  we  see  that 
Xfk)  =  P  (k)  -  Q  (k) 

rwl 

X  (k)  =  P.(k)  +  QJk). 

Because  of  the  symmetry  properties,  we  see  that  the  following  relations  are  true, 

P{k)  -  P  <k)  +  jP.(k)  =  Ev  [X  (k)J  +  j  Odd  [X  (k)j  (94a) 

Q(k)  =  Q  (k)  +  jQ.(k)  =  Ev  [X  (k)]  -  j  Odd  [XJk)].  (94b) 

where,  for  example, 

X  <k)  +  X  JN-k) 

Ev  [Xr(k)J  =  ^ ^ -  k  =  0,  1 . N-l  (95a) 

X  (k)  -  X  (N-k) 

Odd  [Xr(k)]  =  ^ ^ -  k  =  0,  1 . N-l.  (95b) 

From  these  relationships,  several  comments  are  in  order. 

Comment  1:  For  real  sequences,  if  we  are  given  X_(k)  for  k  =  0,  1,  ....  N/2,  and  X.(k) 
for  k  =  1,  2,  . . .  ,  N/2-1,  then  we  can  use  Eqs.  93  to  determine  Xr(k)  and  X^k)  for  all  k. 
This  fact  can  be  used  tc  conserve  memory  when  we  must  store  several  complex  tra'S- 
forms  for  some  reason. 

Comment  2:  We  can  obtain  the  DFT  of  two  sequences  p  and  q  by  using  only  one  evalu¬ 
ation  of  the  DFT  relationships  by  evaluation  of 


N-l 

X(k)  =  )  (p(n)+jq(n))  W_kn  =  P(k)  +  jQ(k). 

n=0 

V/e  have  seen  that  Eqs.  94  and  95  can  be  used  to  recover  P(k)  and  Q(k)  from  X(k).  Com¬ 
ment  1  can  be  applied  here  to  allow  us  to  store  the  transforms  P(k)  and  Q(k)  in  2N  loca- 

«**  '*■' 

tions,  rather  than  in  the  4N  locations  required  to  store  all  values  of  these  transforms. 
Comment  3:  In  computing  |X(k)|2,  4  log  |X(k)|2,  ARG  fX(k)],  COR  (k),  and  arg  fX(k)j, 

-i  1  -  -  in'  M  b 

we  recognize  that  each  of  these  are  even  or  odd  so  that  we  only  need  compute  N/2  appro¬ 
priate  values  of  the  sequences,  and  then  we  can  find  the  rest  of  the  values  by  symmetry. 
Thus  we  can  save  almost  one  half  of  the  time  in  performing  these  operations  on  X(k). 


60 


3.6  COMPUTATION  TIME  REQUIREMENTS 

We  have  shown  the  numerical  techniques  that  must  be  used  to  carry  out  the  trans¬ 
formations  D  and  D_1.  A  major  consideration  is  the  amount  of  time  required  for  these 
operations.  We  shall  provide  rough  guidelines  for  estimating  the  computation  time. 


Fig.  22.  Computational  realization  of  the  characteristic  system. 

The  operations  required  for  the  transformations  D  and  D_1  are  summarized  in 
Figs.  22  and  23.  In  these  figures,  all  operations  are  performed  on  either  one  or  two 
tables  of  real  numbers.  The  length  of  these  tables  is  N,  which  is  a  power  of  two.  As 
we  can  see,  these  numbers  are  combined  together  in  various  ways  to  obtain  the  oper¬ 
ations  of  FFT,  log,  ARG,  etc.  These  numerical  operations  consist  of  additions,  sub¬ 
tractions,  multiplications,  and  divisions,  coupled  together  by  logical  operations  and 
indexing  through  the  tables.  Most  of  the  total  computation  time  is  due  to  the  arithmetic 
operations. 


Fig.  23.  Computational  realization  of  the  inverse  characteristic  system. 


In  realizing  the  transformations  D  or  D*1,  we  require  2  FFT's.  We  have  stated  that 
the  number  of  complex  additions  and  multiplications  is  equal  to  N  log2  N.  The  remaining 
operations;  log,  ARG,  arg,  exp,  etc.  all  require  a  number  of  additions  and  multiplica¬ 
tions  proportional  to  N.  Thus  it  is  reasonable  to  write 


TD  =  2TFFTN  log2  N  +  TADN 

and 

T  .  =  2T„„tN  log.  N  +  T  ,  N, 

D_I  FFT  2  AD'1 


*96) 

(V) 
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Tq  =  time  required  to  compute  the  complex  cepstrum 


T  .  =  the  time  required  to  compute  the  output 
D 

Tpp„pN  log2  N  =  the  time  required  to  compute  the  F  ?T 

-  the  time  required  to  compute  the  log  magnitude  and  the  phase  curve 

A 

T  .  N  =  the  time  required  to  exponentiate  the  transform  of  y. 

AD 

For  one  realization  on  the  TX-2  computer,  we  obtained  the  values 

TpFT  =  6°  Kiec 

Tad  =  9  msec 

T  i  =  6  msec. 

AD”1 

Thus,  for  exunple,  for  N  =  4096,  we  obtain 

Td  =  40  sec 

T  .  =  30  sec 
D 

We  should  point  oui  that  Eqs.  96  and  97  would  be  divided  by  2  if  we  exploit  all  of  the 
symmetry  properties. 


3.7  MINIMUM- PHASE  COMPUTATIONS 


We  have  discussed  certain  simplifications  that  occur  when  the  input  sequence  is 
minimum-phase.  As  we  have  seen,  for  minimum-phase  finite-length  sequences,  all 
zeros  of  X(z)  are  inside  the  unit  circle,  and  we  have  as  many  poles  at  zero  as  we  have 
nonzero  zeros.  We  have  shown  that,  in  this  case,  a  recursive  formula  can  be  derived 
for  x(n)  and  we  can  also  compute  x<n)  from  log  lx(e^w)|  alone  v/ithout  computing  the 
phase.  We  shall  now'  consider  the  actual  computational  realisation  of  this  second  method 
and  compare  it  with  the  recursive  method. 

We  have  seen  that  the  samples  of  log  |  X(e‘'w)|  may  be  computed  by  first  calculating 

N-l 

X(k)  =  X  (k)  +  jX.(k)  =  )  x(n)  W*kn 

n=0 


for  k  =  0,  1 . N-l.  *  We  use  the  FFT  algorithm  to  evaluate  these  eq  ^tions.)  We  may 

use  a  polynomial  approximation  to  compute 


«r<k)  =  log 


!  x(k){  - i  log 


X^(k)  +  X*(k) 


for  k  =  0,  1,  . 


N-l,  as  we  normally  d^  even  if  the  phase  is  to  be  used.  Because  of 


f 

! 

t 


t 

i 

I 


the  periodicity  of  x  and  x,  the  even  part  of  x  is 
x(n)  +  x(N-n) 

Ev  [x(n)j  =  ~ - f - .  (««; 

A 

where  n  and  N-n  are  taken  modulo  N.  Since  X^(k)  is  an  even  function,  its  IDFT  is 
Ev  [x(n)]  so  that 

N-l 

Ev  [x(n)J  =  {  ^  {log  \y£( k)  +  X*{k)l  Wkn  (99) 

k=0 

for  n  =  0,  1,  . .  .  .  N-l .  We  note  that  x(n)  =  0  for  n  <  0  for  minimum- phase  sequences. 
We  have  also  seen  that  xf  x(n)  =  0  for  n  >  N,  then  no  aliasing  will  occur  in  $(n),  that  is, 

x(n)  =  x(n)  fox'  n  =  0,  1,  ....  N-l. 

We  note  that  if  x(n)  =  0  for  n  5  N/2  and  n  <  0,  it  is  clearly  true  that 

x(n)  =  x(n)  for  n  =  0,  1 . N-l. 

Furthermore,  we  see  from  (98)  that  in  this  case  we  can  obtain  x(n)  from  Ev  [x(n)j  by 
using  the  relation 

x(n)  =  Ev  f£(n)]  u(n),  (100) 

where 


u(n)  =  l  n  =  0 

N 

=  2  0  <  n 

=  0  N  <  N.  (101) 

If  x(n)  *  0  for  n  >  N/2,  as  is  usually  the  case,  we  ''an  still  use  Eqs.  99  and  100  to 
compute  an  aliased  approximation  to  x(n).  Clearly,  the  more  rapidly  x(n)  approaches 
zero,  for  large  positive  and  negaiive  values  of  n,  the  better  will  be  the  approximation 
for  a  given  value  of  N.  Alternatively,  the  larger  we  make  N,  the  better  will  be  the 
approximation  tc  x(n).  In  the  case  of  minimum-phase  sequences,  if  we  exponentially 
weight  the  input  so  that 

w(n)  -  onx(n), 

then  the  complex  cepstrum  is  also  exponentially  weighted  by  the  same  exponential.  That 
is,  if  |  a  |  <1, 

w(n)  =  a^(n). 


k 
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Therefore  exponential  weighting  can  be  used  tc  help  reduce  the  aliasing  of  the  complex 
cepstrum. 

It.  summary,  we  can  compute  an  aliased  complex  cepstrum  when  x  is  minimum- 
phase  by  using  the  following  equations  which  are  the  counterparts  of  Eqs.  60-62. 


I 


! 

I 

! 

I 


N-l 

X(k)  =  ^  x(n)  W-kn  k  =  0,  1, ,  N-l  (102a) 

n=0 


N-l 

Ev  f$(n)j  =  £  log  |  X(k)|  W^11  n  =  0,  1 . N-l,  (102b) 

k=0 


x(n)  =  Ev  [x(n)J  u(n). 


()  02c) 


We  have  defined  u(n)  in  (101)  and  we  recall  that  W  =  e 

As  an  alternative,  of  course,  we  have  Eqs.  44.  These  equations  are  repeated  below 
for  the  case  x(n)  =  0  for  n  <  0  and  n  >  M. 


x(n)  =  log  x(0)  n  =  0 


x(n)  r‘  k  x(n-k) 

- )  £$(k) - 

x(0)  £0  n  x(0) 


0  «  n  M 


x(n)  V’  k  x(n-k) 
- )  ££<k) - 

x<°>  k=n-M 


M  <  n. 


It  is  clear  that  the  recursive  algorithm  has  the  advantage  over  Eqs.  102,  in  that  no 
aliasing  is  introduced.  Furthermore,  to  compute  M  samples  of  $(n)  we  require  only 
2M  data  storage  locations.  The  recursive  algorithm,  however,  suffers  greatly  when 
compared  with  Eqs.  102  on  the  basis  of  computation  time.  If  we  exploit  all  the  sym¬ 
metry  properties  in  carrying  out  Eqs.  10'  as  discussed  in  section  3.6,  we  find  that 
for  values  of  M  greater  than  64,  the  evaluation  of  Eqs.  102  for  N  =  2M  is  faster  than 
evaluating  M  points  by  using  the  recursive  algorithm.  This  is  an  estimate  based  on  a 
third-order  approximation  to  the  logarithm  and  the  assumption  that  a  multiplication  takes 
twice  as  long  as  addition.  In  cases  in  which  a  higher  order  approximation  to  the  loga¬ 
rithm  is  required,  or  a  multiplication  takes  longer  than  twice  the  time  for  an  addition, 
this  crossover  point  will  be  higher,  but  not  significantly  higher.  Therefore  for  minimum- 
phase  sequences  we  find  that  even  though  we  have  obtained  a  recursive  algorithm  in 
which  aliasing  is  not  a  problem  it  is  usually  preferable  to  use  Eqs.  102  with  N  such  that 
aliasing  is  not  significant.  The  recursive  algorithm  is  still  useful  conceptually,  and  it 
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may  find  application  when  only  a  small  number  of  values  of  $(n)  is  required  or  when 
aliasing  cannot  be  tolerated. 

3.8  SAMPLING  OF  CONTINUOUS- TIME  SIGNALS 

It  can  be  shown  that  in  the  case  of  continuous -time  signals  whose  Fourier  transforms 
are  bandlimited,  the  samples  of  a  continuous  time  convolution  are  equal  to  the  discrete 
convolution  of  the  samples  of  the  individual  signals.  That  is,  if  time  and  frequency  are 
normalized  so  that  the  Nyquist  frequency  is  1  Hz,  then 

00 

x(n)  =  J  Xj(t)  x2(n-T)  dT  =  ^  Xj(k)  x2(n-k).  (103) 

k=  oo 

In  our  calculations,  we  have  assumed  that  all  sequences  are  of  finite  length.  This 
assumption  would  imply  that  the  corresponding  continuous  time  function  is  time-limited. 
It  is  well  known  that  a  time-limited  signal  cannot  have  a  frequency-limited  Fourier 
transform.  In  practice,  however,  the  Fourier  transform  approaches  zero  quite  rapidly 
in  most  cases,  even  when  the  signal  is  time-limited.  Therefore  we  shall  use  the  approx¬ 
imation  that  both  the  continuous  time  signal  and  its  transform  are  zero  outside  of  some 
finite  interval. 

Let  us  consider  a  continuous-time  function  x(t)  whose  Fourier  transform  Xc(w)  is 
such  that 

X  (w>  =  0  |  Cjj  >  TT. 

The  Fourier  transform  of  the  sequence  of  samples  of  x(t)  is 
X(e^w)  =  X  (w)  for  -ir  <  w  <  v 

and  X(e*w)  is  periodic  with  period  equal  to  2rr. 

Such  an  example  is  shown  in  Fig.  24.  There  we  see  that  if  we  sample  exactly  at  the 
Nyquist  rate,  we  encounter  no  aliasing  of  X(e^w),  and  |  X(e^w)|  is  nonzero  for  all  w.  This 
means  that  log  |X(e^w)j  will  be  finite  for  all  w.  This  is  shown  in  Fig.  25a.  On  the  other 
hand,  if  we  sample  at  a  rate  higher  than  the  Nyquist  rate  as  in  Fig.  24c,  we  see  that 
X(e^u)  will  be  zero  over  a  finite  interval,  so  that  log  j  X(e^w)|  will  be  undefined  in  that 
interval.  This  is  shown  in  Fig.  25b. 

The  previous  example  is  idealized;  however,  the  essential  points  remain  true  in 
practice.  If  we  are  dealing  with  a  finite-length  sequence  that  is  the  result  of  sampling 
some  continuous-time  signal,  we  normally  would  try  to  sample  at  a  rate  that  is  as  high 
as  possible  so  as  to  avoid  aliasing  in  the  Fourier  transform.  Since  the  high-frequency 
content  of  many  signals  (speech  or  seismic  signals)  is  quite  low,  we  shall  normally  find 
that  if  the  sampling  rate  is  high,  xhere  is  a  significant  interval  of  frequency  over  which 
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Fig.  25.  (a)  Log  magnitude  far  Fig.  24b. 

(b)  Log  magnitude  tor  Fig.  24c. 


X(e^w)  is  very  small.  This  means  that  if  we  compute  the  sampled  transform 

N-l 

X(k)  =  X  <k)  +  jX.(k)  =  Y  x(n)  W"kn  k  =  0,  1 . N-l . 

n=0 

then  both  X  (k)  and  X.(k)  will  normally  be  quite  small  for  values  of  k  around  N/2.  Since 
we  must  compute  both  log  |X(k)|  and  ARG  [X(k>],  using  polynomial  approximations 
involving  divisions,  ve  shall  normally  encounter  severe  accuracy  problems  when  both 
X  (k)  and  X.(k)  are  very  small.  This  is  especially  true  in  using  fixed-point  machines 
in  which  we  must  effectively  keep  the  scale  factor  of  X  (k)  and  X.(k)  the  same  for  all 
values  of  k.  Similarly,  the  scale  factor  must  be  the  same  for  log  |X(k)|  and  arg  [X(k)J 
for  all  values  of  k.  Therefore  it  is  clear  that  in  sampling  a  continuous-time  signal  so 
as  to  apply  discrete  homomorpnic  deconvolution,  we  must  effectively  sample  as  in 
Fig.  24b,  rather  than  as  in  Fig.  24c.  In  most  cases  we  can  usually  lowpass  filter  the 
signal  before  sampling.  Therefore  if  we  precede  the  sampler  with  a  very  sharp  cutoff 
filter,  and  then  sample  at  a  little  more  than  twice  the  nominal  cutoff  frequency  of  the 
filter,  we  shall  obtain  a  sequence  in  which  aliasing  is  not  excessive  in  the  Fourier 
transform,  while  at  the  same  time  we  shall  make  it  possible  to  accurately  compute 
log  |X(k)j  ana  arg  |X(k>]  for  all  values  of  k. 


67 


IV.  ANALYSIS  OF  SPEECH  WAVEFORMS 


We  have  shown  that  it  is  possible  to  transform  a  convolution  of  two  or  more  signals 
into  a  sum  of  related  signals.  We  have  seen  that  in  some  cases  these  signals  are  sepa¬ 
rated  in  "time"  so  that  a  frequency-invariant  linear  system  can  be  used  to  advantage  in 
separating  the  signals  from  one  another. 

We  shall  now  discuss  a  simple  model  for  the  production  of  speech  waveforms.  This 
model  results  in  a  representation  of  certain  segments  of  speech  waveforms  as  a  con¬ 
volution  of  an  aperiodic  pulse  with  a  quasi-periodic  impulse  train.  Thus  speech  wave¬ 
forms  are  examples  of  the  class  of  signals  to  which  this  technique  of  deconvolution  is 
particularly  applicable. 

Our  purpose  in  discussing  speech  is  twofold.  First,  we  shall  see  that  speech  wave¬ 
forms  serve  as  very  interesting  examples  of  the  techniques  presented  in  this  report.  Sec¬ 
ond,  Section  V  will  focus  on  the  problem  of  echo  removal  and  detection.  As  a  particular 
example  of  the  application  of  the  results  of  Section  V,  we  shall  discuss  the  removal  and 
detection  of  echoes  in  speech  signals.  Thus  we  shall  also  obtain  an  understanding  of  the 
characteristics  of  the  complex  cepstrum  of  speech  which  will  be  applied  in  another  con¬ 
text  in  Section  V. 

19  20 

The  application  of  homomorphic  deconvolution  to  speech  analysis  by  Oppenheim  ’ 
has  paralleled  our  application  to  echo  removal.  This  section  will  attempt  to  give  a  brief 
introduction  to  this  field  of  application.  The  reader  who  is  interested  m  this  is  directed 

lo  20 

to  Oppenheim  and  Schafer  and  Oppenheim  for  more  detail.  We  shall  first  discuss 
a  model  for  the  speech  waveform  and  then  consider  the  complex  cepstrum  of  speech.  We 
shall  give  some  examples  illustrating  the  applicability  of  our  techniques  to  the  recovery 
of  the  separate  components  of  the  speech  waveform. 

4.  1  SPEECH  PRODUCTION  AND  THE  SPEECH  WAVEFORM 

The  speech  signal  is  an  acoustic  disturbance  that  is  generated  by  air  escaping  from 
the  lungs  of  the  speaker.  The  mouth  and  throat  form  an  acoustic  resonator  called  the 
vocal  tract  which  is  excited  by  the  air  that  is  supplied  by  the  lungs.  We  may  think  of 
the  lungs  as  a  source  of  a  steady  flow  of  air  which  is  converted  into  a  varying  flow  either 
by  the  vocal  cords  or  by  constrictions  of  the  vocal  tract.  In  the  first  case,  the  vocal 
cords  may  convert  the  steady  flow  of  air  into  a  series  of  quasi-periodic  pulses  by  rapidly 
opening  and  closing  the  passage  to  the  lungs.  Sounds  generated  in  this  way  are  called 
voiced  speech  sounds.  Examples  are  the  vowel  sounds.  The  other  principal  class  of 
speech  sounds  is  termed  unvoiced  and  these  are  generated  by  creating  constrictions  in 
the  vocal  tract  which  cause  turbulence  to  occur  at  these  points.  Many  of  the  consonants 
are  generated  by  using  this  type  of  "noise"  signal  as  excitation  for  the  vocal  tract. 

In  Fig.  26a,  we  see  a  schematic  representation  of  voiced  speech  production.  The 
lungs  supply  a  steady  flow  of  air  that  is  modulated  by  the  vocal  cords  to  give  the  excita¬ 
tion  function  e(t)  as  shown  in  Fig.  26b.  The  vocal  tract  is  modeled  by  a  cascade  of  linear 
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resonators  whose  combined  impulse  response  v(t)  is  typically  like  Fig.  26c.  The  reso¬ 
nant  frequencies  of  the  resonators  are  called  the  formant  frequencies.  The  speech 
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Fig.  26.  (a)  Model  of  voiced-speech  pro¬ 
duction.  (b)  Excitation  function 
generated  by  vocal  cords,  (c)  Im¬ 
pulse  response  of  vocal  tract, 
(d)  Resulting  speech  waveform. 


signal  is  the  response  of  the  vocal  tract  to  the  quasi-periodic  excitation  e(t),  and 
therefore  is  given  by 

s(t)  =  y  v(t)  e(t-x)  dT. 

Clearly,  the  resulting  speech  signal  is  also  quasi-periodic.  Since  e(t)  is  quasi-periodic, 
we  could  further  represent  e(t)  as  the  convolution  of  a  basic  pulse  shape  g(t)  (called  the 
glottal  pulse)  with  an  impulse  train  p(t)  such  that 

e(t)  =  g(t)  ®  p(t). 

Therefore  we  obtain 

s(t)  =  v(t)  ®  g(t)  ®  p(t). 

In  the  case  of  unvoiced  sounds,  the  vocal  tract  is  effectively  excited  by  noise  which  is 
not  pulselike  or  quasi-periodic.  A  similar  model  can  be  used,  however,  with  the  exci¬ 
tation  being  a  noise  source. 

We  have  seen  that  speech  can  be  thought  of  as  a  convolution  of  two  or  more 
continuous-time  signals.  In  general,  the  Fourier  transforms  of  the  separate  components 
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of  speech  are  bandlimited  {with  the  exception  of  the  impulse  train).  Thus  it  is  reasonable 
to  assume  that  the  samples  of  the  speech  waveform  can  be  obtained  from  discrete  con¬ 
volutions  of  the  samples  of  the  individual  components.  Thus,  a  reasonable  model  for 
the  samples  of  a  segment  of  voiced  speech  is 

s{n)  =  v(n)  ®  g(n)  ®  p{n), 

where  p(n)  is  a  sequence  that  is  approximately  of  the  form 


6{n-rnp), 


where  np  corresponds  to  the  "pitch"  of  the  sound. 


4.  2  SHORT-TIME  TRANSFORM 


Speech  production  can  be  modeled  as  we  have  just  discussed,  but  implicit  in  our  dis¬ 
cussion  was  the  fact  that  the  character  of  the  speech  waveform  changes  as  time  proceeds. 
That  is,  human  speech  is  a  string  of  sounds  that  are  continuously  changing.  Thus  the 
model  that  we  have  discussed  must  be  time -variant  in  the  sense  that  both  the  excitation 
function  and  the  vocal-tract  impulse  response  change  as  a  function  of  time. 

The  fact  that  the  character  of  the  speech  waveform  changes  with  time,  together  with 
the  fact  that  for  reasonably  good  quality  speech  we  need  at  least  10,000  samples  per  sec¬ 
ond  to  represent  the  waveform,  requires  that  we  adjust  our  notions  about  Fourier  anal¬ 
ysis.  We  find  that  it  is  neither  possible  or  desirable  to  3peak  of  the  Fourier  transform 
of  even  an  entire  sentence,  let  alone  the  transform  of  longer  segments.  Rather,  the 
notion  of  a  short-time  Fourier  transform  is  more  appropriate. 

Suppose  we  are  interested  in  a  segment  of  the  speech  waveform  in  the  vicinity  of 
n  =  £.  (The  time  origin  is  arbitrary.)  We  define  the  short-time  z-transform  as 

L^> 

S(z,£)  =  2  s(n+£)  w(n)  z  (104) 

n=0 

In  Eq.  104,  w(n)  is  a  "window"  containing  L  nonzero  samples.  We  "view"  the  speech 
signal  through  this  window  by  changing  the  parameter  £,  in  order  to  move  the  segment 
of  interest  under  the  window.  The  short-time  Fourier  transform  of  the  sampled  speech 
is  obtained  by  letting  z  =  e^w,  that  is, 

L-l 

S(ej“U)  =  Z  8{n+l>  W(n>  e"jWn-  (105) 

n=0 

Since  S(e^w,  £)  is  really  just  the  transform  of  a  sequence  of  finite  length,  all  of  the  prop¬ 
erties  of  Fourier  transforms  of  such  sequences  apply  to  (105).  We  note  that  the 
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sequence  s(n+ 1)  w(n)  can  be  recovered  by  using  the  inverse  Fourier  transform  relation 

s(n+£)  w(n)  =  jz  \  S{e  ,  i)  dw.  (106) 

In  speech  analysis,  the  length  of  the  window  is  usually  comparable  to  the  duration  of 
the  shortest  speech  sounds.  We  note  that  S(ejw,  i)  is  also  equal  to  the  convolution  of  the 
transform  of  s(n+|)  with  the  transform  of  w(n).  Therefore  if  one  requires  good  fre¬ 
quency  resolution,  longer  windows  may  be  required.  This  is  the  case  in  the  echo 
removal  and  detection  applications  for  which  we  shall  see  that  the  window  must  be  quite 
long  compared  with  the  temporal  detail  of  the  signal.  When  we  are  interested  in  speech 
analysis,  however,  we  find  that  we  must  make  the  window  only  a  few  pitch  periods  long 
so  that  the  character  of  the  speech  signal  remains  esse...*,  ily  the  same  within  the  win¬ 
dow.  Typically,  the  duration  of  the  window  in  this  case  is  equal  to  r  number  of  samples 
equivalent  to  20-*0  msec  of  speech.  At  a  10-kHz  sampling  rate  this  m  from  200  to  400 
samples. 

For  speech  analysis,  we  have  found  the  so-called  Hamming  window  to  be  quite  useful, 
because  of  it3  good  frequency  resolution  properties.  The  discrete  Hamming  window  is 

w(n)  (l  "-cos  ~  n)  0  n  <  L 

1  V  '  (107) 

=  0  elsewhere 

For  echo  removal  and  detection,  we  shall  see  in  Section  V  that  a  truncated  exponential 
window  of  the  form 

win)  a  <zn  0  «*  n  <  L 

=  o  elsewhere 

has  properties  that  are  very  much  suited  to  that  application. 

4.3  SHORT-TIME  COMPLEX  CEPSTRUM  OF  SPEECH 

We  have  seen  that  speech  may  be  modeled  as  a  convolution  if  we  consider  short  seg¬ 
ments  of  the  waveform.  We  have  also  introduced  the  notion  of  a  short-time  transform 
with  an  appropriate  window.  We  shall  now  discuss  how  short-time  ‘ransforms  may  be 
used  to  obtain  a  short-time  realization  of  homomorphic  deconvolution. 

Let  us  consider  a  segment  of  voiced  speech  that  is  multiplied  by  a  window;  that  is, 

s‘nH)  w(n)  =  [vg(n)  ®  pjn+^j  w{n),  (108) 

where  vg(n)  =  v(n)  ®  g(n).  (We  could,  of  course,  assume  without  loss  of  generality  that 
the  segment  of  interest  occurs  at  %  -  o.)In  practice,  the  component  vg(n)is  of  finite  dura¬ 
tion,  and  we  shall  assume  that  it  remains  the  same  over  the  entire  window.  We  observe 
that  the  sequence  whose  values  are  s(n+§)  w(n)  for  0  <  n  <  L,  however,  is  not  strictly  a 
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convolution.  It  should  be  clear  that  this  is  so  because  the  "tails"  of  previous  periods 
overlap  into  the  wmaow  and  the  "tails"  of  periods  within  the  window  are  truncated  by  the 
window.  Nevertheless,  under  appropriate  conditions  this  sequence  will  be  approximately 
a  convolution  so  that  the  short-time  transform  for  such  a  segment  will  be  approximately 
the  product  of  the  transform  of  vg(n)  and  the  transform  of  a  segment  of  p(n).  This  is 
best  illustrated  as  follows. 

Let  us  assume  that  vg  has  a  duration  of  not  more  than  two  pitch  periods.  We  shall 
assume  that  the  window  varies  slowly  so  that  w(n)  »  w(n+2np),  where  n^  is  the  pitch 
period.  Then  we  may  write  Eq.  108  as 

s(n+4)  w(n)  =  vg(n)  ®  Pw(n,  4)  +  e(n,  4), 

where  e(n,  4)  accounts  for  the  overlaps  at  the  beginning  and  the  end  of  the  window,  and 
Pw(n,  i)  =  p(n+4)  w(n). 

Thus  we  have  assumed  that  the  window  is  such  that  vg  remains  essentially  the  same 
across  the  whole  window,  with  the  pitch  pulses  being  the  only  part  that  is  weighted  by  the 
window.  If  we  evaluate  the  short-time  transform  we  obtain 


S(ejw,  4)  =  VG(eJw)  Pw(eiu,  4)  +  E(eJu\  4). 


Therefore  we  see  that  only  if  E(e^u,  4)  is  negligible,  is  S(e^w,  4)  simply  a  product.  If  it 
is  true  that  E(e^w,  4)  is  negligible,  however,  we  see  that 


S(eJw,  4)  =  log  [S(eJW,  4)]  *  log 


Jw 


Pw(eJw,4) 


+  log  [VG(ejw)]. 


Windows  that  taper  to  zero  at  the  ends  are  quite  effective  in  minimizing  the  end 
effects. 

We  define  the  short-time  complex  cepstrum  to  be 

s{n,  4)  =  J  log  [Sfe^,  4)]  e^wn  dw  «  Pw(n,  4)  +  vg{n). 

Thus  the  short-time  complex  cepstrum  is  the  sum  of  p  (n,  4),  which  contains  the  pitch 
information  in  the  interval  spanned  by  the  window,  and  the  complex  cepstrum  of  vg.  The 
latter  can  actually  be  thought  of  as  the  sum  of  two  components,  one  that  is  due  to  the 
vocal  tract  and  the  other  to  the  glottal  pulse.  The  impulse  response  of  the  vocal  tract 
is  minimum  phase;  however,  the  glottal  pulse  is  not.  Therefore  vg(n)  will  be  nonzero 
for  both  positive  and  negative  values  of  n.  Nevertheless,  it  is  clear  from  the  aperiodic 

A 

nature  of  vg,  that  the  component  vg(n)  will  tend  to  be  concentrated  around  n  =  0  and  will 
be  bounded  by 

|vg(n)|  <  for  all  n. 
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On  the  other  hand,  p  (n,  £)  is  an  impulse  train  which  m  general  will  give  an  impulse 
tram  in  the  complex  cepstrum,  with  the  impulses  occurring  at  multiples  of  the  pitch 
period.  Thus  we  have  a  situation  in  which  the  complex  cepstrum  can  be  divided  into  two 
regions,  each  corresponding  to  a  different  component  of  the  speech  signal.  To  recover 
the  component  pw(n,  £),  we  zero  those  samples  of  the  complex  cepstrum  for  J n  [  <  n^,  and 
then  employ  the  inverse  of  the  characteristic  system  to  obtain  pw(n,  £}.  To  recover  vg, 
we  replace  with  zero  those  samples  for  |n|  s  n^,  and  transform  the  result  with  the 
inverse  system.  Examples  of  this  are  given  in  section  4.  4. 

If  the  speech  segment  under  consideration  is  unvoiced,  the  ahort.-iime  complex  cep¬ 
strum  does  not  have  the  impulse  train  component  as  in  the  voiced  speech.  Thus,  the 

comolex  cepstrum  can  be  used  as  a  pitch  detector  and  voiced/unvoiced  detector  for 

11  12 

vocoder  and  speech  analysis  applications.  Noll  '  has  shown  tha-  the  cepstrum  is 
very  successful  in  this  application,  (The  cepstrum  as  defined  by  others  is  essen. ially 
the  even  part  of  the  complex  cepstrum,)  It  appears  that  the  complex  cepstrum  may  offer 
advantages  over  the  cepstrum,  since  it  is  possible  to  actually  extract  the  component 
p  (n,  i)  so  that  we  obtain  information  about  variations  in  pitch  across  the  window,  rather 
than  an  average  pitch  period  as  is  obtsi*-  U  :n  the  cepstrum.  The  reason  for  this  is  that 
if  the  pitch  is  not  constant,  the  impulse^  in  th^  „epstrum  and  complex  cepstrum  either 
become  smeared  out  around  some  ave^afto  pitch  period,  or  impulses  appear  at  longer 
times  than  the  fundamental  period.  The  following  examples  illustrate  this  point. 

4.4  EXAMPLES 

We  shall  first  consider  two  examples.  We  shall  consider  the  recovery  of  pitch,  and 
then  the  recovery  of  the  component  vg,  which  contains  the  impulse  response  of  the  vocal 
tract  and  the  glottal  pulse. 

As  an  example  of  the  recovery  of  pitch,  let  us  consider  the  segment  of  the  vowel  "ah" 
as  m  father,  shown  in  Fig.  27a.  This  waveform  was  sampled  at  a  10 -kHz  rate,  and  in 
Fig.  ?.7a  the  samples  have  been  connected  by  straight  lines  to  form  a  smooth  curve. 
(This  >s  the  case  in  all  of  the  curves  that  we  show  in  this  section.)  Note  that  the  pitch 
is  quite  constant  in  the  interval  shown. 

In  Fig.  27c  is  shown  the  complex  cepstrum  for  the  segment  of  Fig.  27a  when  weighted 
by  the  Hamming  window  of  Eq.  107,  where  L=  256  (25.  6  msec).  Since  the  Hamming 
window  is  very  small  in  value  at  its  ends,  it  tends  to  minimize  the  error  caused  by  over¬ 
lap.  We  note  significant  peaks  in  the  complex  cepstrum  at  appi  oximately  ±5  msec 
(i  J  samples).  This  is  the  period  of  the  waveform  in  Fig.  27a.  We  also  note  that  the 
complex  cepstrum  is  relatively  small  for  values  beyond  50  samples.  Figure  27b  shows 
the  output  of  the  inverse  characteristic  system  after  having  replaced  the  samples  tor 
jn|  ^  40  with  zero.  We  note  that  the  impulses  are  spaced  at  the  pitch  period  and 
are  weighted  by  the  shape  of  the  Hamming  window. 

As  an  examole  of  what  happens  wnen  the  pitch  varies  across  the  window,  we 
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replaced  the  sampler  of  Fig.  27c  with  zero  for  |n|  5  40.  Thus  the  pitch  was  removed, 
and  the  resulting  output  of  the  inverse  characteristic  system  was  a  pulse  containing 
the  glottal  pulse  and  vocal  tract  information.  This  pulse  is  shown  in  Fig.  29c.  This 
pulse  was  used  in  the  computer  to  synthesize  a  new  waveform  in  which  the  spacing 
of  impulses  in  p(n)  was  alternating  between  35  samples  and  40  samples.  This 
waveform  is  shown  in  F  ig.  28a.  (Such  pitch  fluctuations  have  been  reported  by 
NoU.12) 


(a) 


.  I _ I _ I _ I _ I _ 

-100  -50  0  50  100  n 

Fig.  27.  Pitch-extraction  example,  (a)  Segment  of  the  vowel  "ah."  (b)  The 
output  for  a  long-pass  system,  (c)  Complex  cepstmm  for  Hamming 
window. 


The  complex  cepstrum  for  Fig.  28a  weighted  by  a  Hamming  window  with  L  = 
256  samples  is  shown  in  Fig.  26c.  This  time,  we  note  that  there  are  significant 
peaks  at  n  =  -75,  -40,  +35,  and  +75.  The  values  of  the  complex  cepstrum  wr re 
replaced  by  zero  for  |n|  <32  and  the  resulting  output  of  the  inverse  characteristic 
system  is  shown  in  Fig.  28b.  It  is  clear  that  in  this  case,  the  output  of  the  over¬ 
all  system  shows  very  clearly  how  the  pitch  changes  in  this  time  interval,.  On  the 
other  hand,  things  are  not  so  clear  in  the  complex  cepstrum.  In  the  even  part  of 
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the  complex  cepstrum,  that  is,  the  cepstrum,  we  would  find  peaks  at  n  =  ^35,  ±40, 
and  ±74.  Thus  we  would  probably  have  to  be  content  with  ar  average  constant  pitch 
over  this  interval,  since  the  cepstrum  does  nc  tell  in  wiiat  order  the  long  and 
short  periods  occur. 

(o) 


Fig.  28.  Pitch  extraction  example,  (a)  Synthetic  speech  with  fluctuating  pa.'h, 

(b)  The  output  for  a  long-pass  system,  (c)  Complex  cepstrum  fo.  a 
Hamming  window. 

The  first  two  examples  indicate  that  the  use  of  the  complex  cepsti’um  may  be 
desirable  in  situations  in  which  we  are  interested  in  very  accurate  and  synchronous 
pitch  extraction.  This  remark  is  based,  however,  on  only  limited  experimentation 
and  can  only  be  verified  through  more  extensive  effort  in  this  direction. 

As  a  third  example,  we  note  that  in  obtaining  the  waveform  of  Fig.  28a  we 
recovered  a  pulse  from  the  complex  cepstrum  of  Fig,  27c.  This  pulse  is  shown 
in  Fig.  29c.  The  original  waveform  is  repeated  in  F:g.  29a  with  expanded  time  and 
amplitude  scales.  To  show  that  this  pulse,  together  with  pitch  information  for  the 
original  waveform,  is  sufficient  to  recover  the  waveform,  we  synthesized  the  wave¬ 
form  of  Fig.  29b,  using  pitch  information  obtained  from  Fig.  27b.  We  note  that  the 


Fig.  29.  Example  of  speech  deconvolution,  (a)  Original  speech  (same  as 
Fig.  27a).  (b) Synthesized  speech  using  the  pulse  in  (c)and  pitch 
information  in  Fig.  27b.  (c)  Pulse  obtained  from  the  cepstrum 
of  Fig.  27c,  using  the  values  for  Ini  <  40.  (Note:  sampling  rate 
is  10  kHz.) 


waveforms  (a)  and  (b)  are  not  exactly  the  same,  but  they  are  quite  similar  in  most 
details.  From  this  example  and  Fig.  27b,  it  seems  clear  that  the  assumptions 
employed  in  section  4.  3  are  justified  in  this  case.  That  is,  the  component  vg  is 
not  affected  to  a  great  extent  by  the  weighting,  while  the  pitch  impulses  retain  the 
shape  of  the  window. 
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V.  APPLICATIONS  TO  ECHO  REMOVAL  AND  DETECTION 
5.  1  A  SIMPLE  EXAMPLE 

We  have  presented  the  theoretical  and  computational  details  of  homomorphic  decon¬ 
volution.  Now  we  shall  be  concerned  with  the  application  of  these  techniques  to  the  pro¬ 
cessing  of  signals  containing  echoes.  To  illustrate  our  approach,  let  us  consider  a 
simple  example.  We  can  easily  obtain  analytical  results  for  this  example,  and  we  shall 
also  present  computational  results  for  comparison. 

Suppose  we  have  a  sequence  x  whose  values  are 

x(n)  =  s(n)  +  astn-n^  =  s(n)  <g>  p(n),  (108) 

where  the  sequence  s  has  values 

s(n)  =  na11  0  ^  n  <  M 

=  0  elsewhere. 

Furthermore,  0  <  a  <  1,  and 

p(n)  =  6(n)  +  afifn-np. 

The  s -transform  of  Eq.  108  is 


X(z)  =  S(z) 


Kn‘). 


(109) 


where  S(z)  can  be  shown  to  be 


S(z)  = 


az  !(l-aMz  M)  MaMz-M 


- 1  2  -  ”  • 
(1-az  )  1  -  az 

M 


If  M  is  large,  a  approaches  zero,  so  that  for  large  M 
-1 


az 


S(z)  « - _  r  2 . 

(l-az  Y 

The  logarithm  of  X(z)  is 


X(z)  =  log  [S(z)]  +  log  £  1+az  1  j  , 


which  can  be  written 


-n. 


A  _]  _1  "I 

X(z)  -  log  a  +  log  z  -  2  log  [l-az  ]  +  log  1+az 
Using  the  Laurent  series  expansion  of  the  logarithm,  we  can  write  Eq,  110 


(110) 


a  ,  r'  ~  m  v '  .  .  m  Tun. 

X(z)  =  log  a  +  log  z-1  +  ^  z~m  +  ^  (-')  1  If2  l-  (in> 

m=  1  m=  1 

If  we  choose  the  contour  of  integration  to  be  the  unit  circle,  and  remove  the  term  log  z  * 
by  shifting  the  sequence  one  place  to  the  left,  we  obtain  for  the  complex  cepstrum 

«  k  »  k 

x(n)  =  (log a)  6(n)  +  £  6(n~k)  +  Z  (~1)k+1  V  6<n_klV*  (11Z) 

k=l  k=l 

From  Eq.  112  we  see  that  the  contribution  'ue  to  p  has  samnles  spaced  at  intervals 
of  n.  while  the  samples  due  the  sequence  s  have  unit  spacing.  Clearly  both  approach 

1  i 

zero  faster  than—,  but  since  the  samples  due  to  p  have  greater  spacing,  the  part  due 
to  s  will  occupy  primarily  the  "short-time"  region  while  the  part  due  to  the  echoes  will 
be  in  the  "long  time"  region. 

This  example  was  actually  computed  as  discussed  in  Section  III  for  a  =  .  96,  a  =  .  5, 

M  =  800,  and  nj  =  80.  The  value  of  N  for  the  DFT  was  2048.  The  sequences  s  and  x 
are  shown  in  Fig.  30a  and  30b,  respectively.  In  Fig.  30c,  we  show  the  samples  of  the 
real  part  of  Eq.  110  for  z  =  eJt0.  That  is.  Fig.  30c  is 


—  k 


log  |X(k)  |  =  log  a  -  2  log  1  -a  e  ^  +  log  1  +  a  e 


-3#n,k 


In  Fig.  30d  we  show  the  principal  value  of  the  phase  of  X{k).  This  includes  the  linear 
phase  component,  as  we  can  see,  since 


ARG[x(f-l)]«-., 


and  the  net  number  of  positive  and  negative  jumps  between  k  =  0  and  k  =  -y  -  1  is  zero. 
Figure  30e  shows  the  phase  curve  after  adding  the  corrections  involved  in  removing  the 
discontin'>:.ies  and  rotating  the  sequence.  Figure  30f  shows  the  complex  cepstrum  for 
this  example.  We  note  that  the  part  attributable  to  s(n)  is  primarily  concentrated  between 
n  =  0  and  n  =  80,  while  the  impulses  attributable  to  the  echo  occur  at  n  =  80,  160,  etc. 

We  should  point  out  that  after  rotating  to  the  left  one  sample,  the  input  sequence  is  mini¬ 
mum  phase,  so  that  x(n)  =  0  n  <  0. 

Let  us  now  consider  huw  wo  might  choose  a  frequency-invariant  linear  system  to 
recover  each  of  the  components  s  and  p  from  the  complex  cepstrum.  From  Fig.  30b, 
we  see  that  to  recover  the  sequence  s,  we  must  remove  the  impulses  caused  by  p.  One 
way  of  doing  this  is  to  form 

$(n)  =  f(n)  x{n), 

where  £(n)  is  as  in  Fig.  31a,  and  nc  <  80.  This  type  of  linear  system  is  appropriately 


termed  a  short-pass  system.  With  this  system,  we  would  remove  all  of  the  contribu¬ 
tion  caused  by  p,  and  part  of  that  caused  by  s.  The  analogy  to  lowpass  filtering  should 
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Fig.  31.  Frequency -invariant  linear  systems  for  (a),  (b)  echo 
removal,  and  (c)  detection. 

be  clear.  If  we  consider  Fig.  30c  and  30e,  we  see  that  both  the  log  magnitude  and  the 
phase  are  made  up  of  the  sum  of  a  slowly  varying  component  caused  by  s,  and  a  more 
rapidly  varying  component  caused  by  p.  In  attempting  to  recover  s,  we  must  remove 
the  rapidly  varying  components  while  leaving  the  slowly  varying  component  relatively 
unaltered.  Alternatively,  if  we  have  accurate  knowledge  of  the  echo  time  (80  samples 
in  this  case),  a  comb  system  such  as  that  shown  in  Fig.  31b  would  remove  the  contribu¬ 
tion  from  p  and  would  retain  almost  all  of  the  part  from  s.  In  either  case,  operating 
on  p(n)  with  the  inverse  system  should  produce  an  approximation  to  s(n).  The  compu¬ 
tational  example  was  carried  out  for  these  choices  and  the  results  are  shown  in  Fig.  32. 
In  Fig.  32c  we  show  the  output  for  the  comb  system  with  nc  =  80.  Clearly,  it  is  not  pos¬ 
sible  to  distinguish  Fig.  32c  from  Fig,  30a.  Similarly,  we  chose  nc  =  79  in  the  short- 
pa  cs  system  and  obtained  the  output  shown  in  Fig.  32d.  Again,  Fig.  32d  is  not 
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Fig.  32.  Output  waveforms  for  the  example  of  Fig.  30 


distinguishable  from  Fig.  30a;  however,  in  Fig.  32c  we  have  shown  the  output  when  the 
short-pass  system  was  used  with  nc  =  16.  A  look  at  Fig.  30f  shows  that  much  of  s(n) 
has  been  deleted  by  the  linear  system,  and  therefore  we  should  not  expect  the  output  to 
look  exactly  like  s(n).  The  fact  that  the  shape  is  the  same  is  the  result  of  the  simplicity 
of  the  distribution  of  zeros  of  S(z),  and  the  fact  that  s  is  minimum-phase. 

We  shall  say  more  about  the  implications  of  minimum  phase  in  this  example,  but  first 
let  us  consider  the  problem  of  recovering  the  sequence  p.  From  the  complex  cepstrum 
in  Fig.  30f  it  is  clear  that  if  we  choose  the  linear  system  of  Fig.  31c,  with  nc<80,  most 
of  s(n)  will  be  deleted,  and  we  shall  retain  the  part  attributable  to  p.  Such  a  system 
could  be  appropriately  called  a  long-pass  system.  In  Fig.  32a  we  show  the  output  of  the 
inverse  system  for  a  long-pass  linear  system  with  nc  =  79.  We  note  that  very  clearly 
we  have  recovered  a  sequence  whose  values  are  very  close  to  p(n).  In  Fig.  32b  we  show 
the  output  for  n  =16.  In  this  case,  we  have  retained  a  significant  part  of  s(n),  and  thus 
the  impulses  of  p  are  convolved  with  the  filtered  s(n).  Since  most  of  the  significant  part 
of  s(n)  was  removed,  however,  the  part  of  the  output  from  s(n)  is  very  small. 

We  have  previously  noted  that  the  input  sequence  was  minimum-phase.  Thus  the 
recursive  algorithm  will  give  the  same  result  as  that  obtained  by  using  the  integral 
expressions  on  the  unit  circle.  The  recursion  formula  for  the  inverse  system  is 


y(n)  =  e^°)  n  =  0 

n-1 

=  y(0)y(n)+  £  ~  $(k)  y(n-k)  n  >  0, 
k=0 


where  y(n)  =  f(n)  x(n). 

The  recursive  expression  helps  us  to  understand  the  appearances  of  each  of  the  out¬ 
puts  in  Fig.  32.  For  example,  in  both  Fig.  32a  and  32b,  y(0)  =  0.  Since 


7(0)  =  e^(0), 


we  see  that  since  y  is  minimum  phase  and  y(0)  =  0,  then  y(0)  =  1.  Similarly,  we  see 
that  if  $(n)  =  0  for  0  ^  n  <  nc.  then  y(n)  =  0  for  0  <  n  <  nc»  This  is  shown  by  both 
Fig.  32a,  where  nc  =  79,  and  32b,  where  nQ  =  16.  We  note  that  this  explains  why  the 
contribution  attributable  to  s(n)  in  Fig.  32b  begins  16  samples  from  each  impulse. 

In  Fig.  32c,  32d,  and  32e,  we  recall  that  £{n)  =  x(n)  for  0  «  n  <  n£.  Thus  from  the 
recursion  formula,  we  see  that  y(n)  =  x(n)  for  0  *£  n  <  nc  (within  computational  accuracy). 
Since  x(n)  =  s(n)  0  <  n  «  n  in  each  case,  we  see  that  y(n)  =  s(n)  for  0  n  <  n  .  Because 

A  ^  ® 

s(n)  is  very  small  for  n  S'  80,  we  see  that  y(n)  »  s(n)  for  n  ^  n£  in  Fig.  32c,  and  that.  y(n) 
should  be  a  better  approximation  in  Fig.  32d,  since  the  comb  system  retains  most  of 
s(n).  In  Fig.  32e,  we  note  that  y(n)  =  s(n)  0  «  n  <  16,  but  for  n  >  16,  the  output  decays 
much  too  fast.  This  can  clearly  be  accounted  for  by  the  recursion  formula. 

This  example  has  illustrated  many  of  the  important  concepts  in  the  use  of 
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homomorphic  deconvolution  in  processing  signals  containing  echoes.  We  have  seen  that 
it  is  possible  to  remove  the  echo  from  the  sequence  x  by  using  a  short-pass  or  comb 
linear  system.  Furthermore,  we  have  seen  that  it  is  possible  to  recover  the  impulse 
train  by  using  a  long-pass  system.  We  note  that  this  impulse  train  can  be  used  in  echo 
detection,  and  we  shall  see  that  it  has  advantages  over  using  either  the  cepstrum  or 
complex  cepstrum  for  this  purpose.  It  should  be  clear  that  the  characteristics  of  the 
complex  cepstrum  of  the  impulse  train  are  of  primary  importance  in  choosing  the  linear 
system.  Thus,  we  shall  next  study  this  question  in  detail. 

5.  2  COMPLEX  CEPSTRUM  OF  AN  IMPULSE  TRAIN 

An  impulse  train  is  defined  as  a  sequence  in  which  the  nonzero  samples  are  spaced 
at  intervals  that  are  greater  than  one.  An  example  of  such  a  sequence  is  the  sequence 
whose  values  are 


?• 


M-l 

p{n)  =  6(n)  +  ^  ck6(n-nk).  (113) 

k=l 

Sequences  of  this  kind  can  be  used  to  represent  signals  that  contain  echoes  or  reverbera¬ 
tions.  For  example,  the  sequence  whose  values  are 

M-l 

x(n)  =  s(n)  +  ^  aks(n-nk), 
k=l 

can  be  represented  as  the  convolution 
x  =  s  ®  p, 

where  the  values  of  p  are  given  by  (113),  and  s  it.  a  sequence  whose  values  are  spaced 
at  unit  intervals.  The  complex  cepstrum  of  x  has  values 

x(n)  =  s(n)  +  p(n). 

We  have  seen  that  even  if  the  sequence  is  of  finite  length,  the  complex  cepstrum  s  is  in 
general  of  infinite  extent.  Most  of  the  energy  in  s  is  concentrated,  however,  in  an 
interval  of  the  same  order  of  magnitude  as  the  duration  of  s.  We  saw  in  the  example 
that  the  impulse  train 


l 


This  component  may  be  clearly  seen  in  Fig.  30d.  Thus,  in  this  simple  case,  ve  see 
that  the  complex  cepstrum  is  also  an  impulse  train  with  samples  spaced  at  intervals  of 
n^.  We  have  also  seen  that  the  two  sequences  &  and  p  may  occupy  essentially  different 
intervals  so  that  ii  is  possible  to  recover  either  s  or  p  by  using  very  simple  frequency- 
invariant  systems.  Since  we  must  have  some  knowledge  of  the  complex  cepstrum  of  the 
impulse  train  in  order  to  choose  the  linear  system,  we  shall  consider  in  detail  the  prop¬ 
erties  of  the  complex  cepstrum  of  an  impulse  train. 

Let  us  consider  an  impulse  train  such  as  Eq.  113.  The  corresponding  z-transform 
is 


P(z)  =  1  + 


M-l 

I 


k=l 


-n 

akz 


k 


(114) 


In  general,  the  analytical  determination  of  the  complex  cepstrum  corresponding  to  (114) 
is  quite  difficult  because  it  is  quite  hard  to  determine  the  zeros  of  the  right-hand  side 
of  (114).  In  the  special  case  wherein  the  impulses  are  equally  spaced,  however,  it  is 
possible  to  give  a  general  result.  Consider  an  impulse  train 


M-l 

P(n)  =  ^  <xk6(n-kno). 
k=0 

Let  us  define  a  sequence  q  having  values 
q(n)  =  an  0  «  n  <=  M  -  1 

=  0  elsewhere. 

Thus  we  see  that 

P(n)  =  q  n=0,no . (M-l)nQ 


i 

! 


=  0  elsewhere. 

The  z-transform  of  p  is 


-kn  ^  /  n  \~k  /  n  \ 

p(z)=  ^  V  =  £  ak  \z  )  =Q(Z  °)- 


k=0  k=0 

Thus  it  is  clear  that  p  will  be  given  by 


p(n)  = 


=  C 


n  =  0,  ±nQ,  ±2nQ, ... 


elsewhere. 
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That  is,  for  impulse  trains  with  equal  spacing,  the  complex  cepstrum  is  also  an  impulse 
train  with  the  same  spacing.  We  note  that  even  though  p  is  a  finite-length  impulse  train, 
p  is  in  general  of  infinite  duration. 

If  the  spacing  of  the  samples  is  not  uniform,  it  is  not  possible  to  use  the  argument 
above.  Since  we  are  not  able  to  give  a  general  result,  we  shall  illustrate  the  range  of 
possibilities  with  several  examples. 

Example  1 

Let  nk  =  knQ  and  ak  =  a  in  Eq.  114,  so  that  P(z)  can  be  written 
M-l 

P(z)  =  )  «z  °.  (115) 

k=0 


By  a  simple  manipulation  of  (115),  we  can  obtain  the  more  convenient  form 


P(z)  = 


(-’•I 


(116) 


The  logarithm  of  P(z)  is 


P(z)  =  log  (l-aMz  °)  -  log  (l-az  ^  , 


and  if  ■.%  <  1 ,  we  may  write 


*  k  -kn  v-'  Mk  -kMn 

-I  t-  #-I  V*  0 


k=l 


k=l 


(Note  that  such  an  expansion  implies  that  the  region  of  convergence  includes  the  unit 
circle.)  Therefore,  the  complex  cepstrum  is 

£  k  ?  kM 

P(n)  =  2  T  6<n"kn05  "  2  £lT6(n"kMno)-  (**') 

k=l  k=l 


The  two  sequences  making  up  p  are  shown  in  Fig.  33  for  M  =  3.  Thus,  we  see  that  if 
the  echoes  are  exponentially  decreasing  in  amplitude  and  also  equally  spaced,  the 
complex  cepstrum  is  an  impulse  train  with  the  same  spacing.  We  also  note  that  p 
is  minimum-phase  for  |a|  <  1,  and  therefore  p(n)  =  0  for  n  <  0. 


i 
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-? 


(b) 


Fig,  33.  (a)  Complex  cepstrum  for  Eq.  115. 

(b)  Complex  cepstrum  for  Eq.  118. 


Example  2 


Let  us  suppose  that 


-n.-n  -2n.  -3n.  -n 

P(z)  =  I  +  cz  °  +  pz  1  +  ctf3z  , 

where  |pj  >  1  and  |«j  <1.  Equation  118  may  bt  itten 


(118) 


-2n,  /  -n.  -n  \  /  ,  2n,  \ 

P(z)  -  Pz  ^l+az  ^^l+p”1*  1j, 

and  the  logarithm  of  P(z)  is  therefore 

A  -2n.  /  -n.-n  \  /  ,  2n  \ 

P(z)  =  log  P  +  log  z  +  log  ^1+az  )  +  log  (  1+P  z  ).  (119) 

If  we  assume  that  the  region  of  convergence  contains  the  unit  circle,  we  may  expand  (119) 
as 
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P(z)  =  log  P  +  log 


8  +  lOg  Z  Rl  +  Y  (-l)k+1 


l  <-»>  r* 

k=l 


ak  ~(n1+no)k 


I" 


{c+1  P  2njk 
!>  — *  * 


If  we  also  assume  that  the  term,  log  z  1  is  removed  in  computing  the  phase  curve,  we 
see  that  p(n)  can  be  written 


p(n)  =  (logp)  6(n)  +  ^  <‘l)k+1T~  6(n-kn1-knQ)  +  ^  (-1)K+1-^-  Kn+kZnJ. 


The  sequence  is  depicted  in  Fig.  33b. 

We  note  that  in  this  case  the  sequence  p  is  nonminimum-phase  after  a  shift  left  of 

2n^  samples,  and  therefore  p(n)  *  0  for  n  <  0.  We  note,  however,  that  p  is  again  an 

impulse  train  with  spacing  related  to  the  spacing  of  the  sequence  p.  Clearly,  it  would 

be  difficult  to  detect  all  echoes  in  this  case  by  using  only  the  impulses  in  the  complex 

cepstrum.  If,  however,  we  use  a  long-pass  system  that  replaces  the  samples  in  the 

interval  -2n,  <  n  <  n,  +  n  with  zero,  and  then  transform  the  result  with  the  inverse 
1  1  o 

system,  we  shall  obtain  an  output  placing  in  evidence  all  of  the  samples  of  the 
sequence  p. 

Example  3 

Let  P(z)  be 

"nl  ~n2 

P(z)  =  1  =  CjZ  1  +  a2  z  (120) 

where  n2  >  n  .  The  logarithm  of  P(z)  i3 

a  /  “nj  ~no\ 

P(z)  =  log  ^1+ftjZ  +a2z  “). 

A 

If  we  assume  that  the  region  of  convergence  of  P(z)  contains  the  unit  circle  and  that 
-jwn.  -jwn,, 

max  a,  e  1  4-  e  "  <1, 

then  we  may  write 

/  "nt  ~n,\k 

00  r  z  +a  z 

a  V*  k+1  \  1  2  / 

P(z)  =  )  (-1)K+1  -5— - .  (121) 

U  k 

k=l  K 

(We  shall  say  more  about  this  restriction  on  P(z)  after  this  example.)  The  binomial 
term  may  be  expanded  as 
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(  ~n  -n_, 

alz  +a2z 


„k  k 


-kn.+rn ,-rn. 
-r  r  112 

a2z 


!)  •  i  (»  ^ 

'  r=0 

where  the  quantity  ^k^  is  the  binomial  coefficient 


(S) 


k! 


(k-r)lr! 


Thus  we  can  write  (121) 


P(z)  =  £  (-l)k+1 
k=l 


tie) 


k-r  r  -(*“1  -rn  +r”  ) 

“1  v 


r=0 


and  the  complex  cepstrum  is  therefore 


P(n)  =  f  (-J)k+1Tl  (r)  ^~r<I26<n“knl+rnl“rn2>* 
k~l  r=0 

This  expression  places  in  evidence  several  important  facts.  First,  we  note  that  p(n)  = 

0  n  <  0.  This  is  a  result  of  our  assumption  regarding  and  a 2.  Second,  we  can  see 
that  the  impulses  occur  at 

n  =  (k-r)nj  +  rn2  k  =  1,  2,  . . .  and  r  =  0,  1 . k. 

The  values  of  p(n)  are  given  in  Table  1  for  1  <  k  £  6. 

The  most  striking  thing  about  Examples  2  and  3  is  how  much  more  complicated  the 
complex  cepstrum  becomes  when  the  impulse  train  is  nonminimum  >phase,  or  when  the 
impulses  are  not  equally  spaced. 

In  general,  we  must  consider  impulse  trains  of  the  form  of  Eq.  113,  and  clearly 
if  the  impulse  train  is  both  not  equally  spaced  and  nonminimum-phase,  the  complex 
cepstrum  will  have  impulses  located  throughout  the  range  -^<n  <  ».  It  is  to  Dis¬ 
advantage  if  the  impulses  in  the  complex  cepstrum  occur  onl)  in  the  "long-time" 
region.  Since  the  spacing  of  the  impulses  is  not  under  our  control,  we  can  only 
look  at  the  possibility  of  making  the  impulse  train  minimum -phase.  With  respect 
to  this  question  we  can  make  some  definite  statements.  We  recall  that  exponential 
weighting  can  be  used  to  make  a  sequence  minimum-phase.  We  also  recall  that  if 
x  is  a  convolution,  its  values  are  given  by 


x(n)  = 


p(n-k). 


The  exponentially  weighted  sequence  can  be  written 
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(3nx(n)  =  2  Pks<k)  Pn_kp(n-k). 
k 

That  is,  each  member  of  the  convolution  is  also  exponentially  weighted.  Therefore, 
exponential  weighting  can  be  used  to  make  the  impulse  train  minimum -phase.  This 

Table  1.  Locations  and  values  of  the  impulses  in  the  complex  cepstrum  of 
p(n)  =  6(n)  +  a^n-nj)  +  a26(n-n2). 


will  then  insure  that  the  part  of  the  complex  cepstrum  attributable  to  p  will  occur  only 
in  the  region  n  >  nj ,  '•’ere  nj  is  the  spacing  between  the  first  and  second  impulses. 

In  general  we  are  concerned  with 

M-l  -n 

P(z)  =  i  +  Y  V  k>  <122> 

k=l 
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■A 

4 


V 

$ 


where  n.  <  n,  <  . . ,  <  n  Since  exponential  weighting  effectively  replaces  a,  in 

i  c  m-i  k 

Eq.  122  by  p  a^,  it  is  advantageous  to  have  at  least  a  sufficient  condition  that  these 
quantities  must  satisfy  in  order  that  the  sequence  be  minimum-phase.  Such  a  condition 
is  easily  obtained  since  the  requirement  of  minimum  phase  is  equivalent  to  requiring 

that 


P(z)  =  log 

r  M-l  1 

■  ♦  I  V  k 

(123)  | 

k=l 

1 

have  a  power-series  expansion  converging  in  a  region  |  z  |  >  a,  where  0  <  a  <  1.  A  suf¬ 
ficient  condition  for  this  to  be  true  is 


max 

-tr<w<ir 


.-jwnk 


The  condition  of  (124)  is  satisfied  if 
M-l 


I  I 


V  <u 


Thus  (125)  constitutes  a  sufficient  condition  for  the  sequence  corresponding  to  (122)  to 
be  minimum -phase.  If  the  do  not  satisfy  (125),  the  sequence  may  still  be  minimum- 
phase  (as  in  Example  1);  however,  in  general  we  can  always  insure  that  the  impulse 
train  is  mini  . num -phase  by  choosing  p  so  that 


M-l  .  „ 


Since  we  do  not  generally  know  in  advance  the  size  of  the  we  must  rely  on  approxi¬ 
mate  information  about  the  shortest  echo  time,  the  spacings,  and  the  relative  sizes  of 
the  echoes  in  order  to  choose  the  proper  value  for  p. 

In  conclusion,  we  note  that  if  the  nequence  p  is  minimum-phase,  then  the  complex 
cepstrum  may  be  obtained  from  the  following  expansion  of  Eq.  123: 


oo  /  M-l 

™  ■  l  I 

k=l  \  r=0 


In  this  case  we  must  use  the  multinomial  expansion  for 


Z  o  z 
r=0 


~nr\k 


,  and  clearly  the 


result  will  be  a  rather  complicated  distribution  of  impulses  in  the  complex  cepstrum. 


We  can  state,  however,  that  there  will  be  no  impulses  in  the  complex  cepstrum  for 
n  <  nj.  Furthermore,  the  impulse  sizes  will  approach  zej-o  for  large  n  and  will  occur 
at 

M-l 

n=  £  mkV 

k=l 

where  the  m^  take  on  all  positive  integer  values. 

5.3  DISTORTED  ECHOES 

Up  to  this  point,  we  have  assumed  that  the  echoes  were  exact  replicas  of  some 
sequence  s.  In  practice,  of  course,  we  are  interested  in  sampled  continuous-time 


Fig.  34.  Linear  model  for  the  production  of  echoes. 

signals,  and  this  assumption  is  generally  not  exactly  correct.  A  more  realistic  model 
for  the  generation  of  continuous  time  signals  containing  echoes  is  shewn  in  Fig.  34. 
The  input  s(t)  is  assumed  to  be  bandlimited,  and  the  h^ft)  specify  the  continuous¬ 
time  impulse  responses  of  M  -  I  linear  systems  corresponding  to  M  -  1  different 
echo  paths.  Thus  x(t)  is  bandlimited,  and  is  given  by 

x(t)  =  s(t)  ®  [uQ(t)+a  jhj(t-Tj )+  . . .  + 

If  we  sample  x(t)  (assuming  that  the  Nyquist  frequency  is  1),  it  can  be  shown 
that  the  samples  of  x(t)  are  given  by 

x(n)  =  s(n)  ®  p(n) 


where 


P(n)  =  6(n)  +  (n-iij)  +  . . .  +  aM_jhA 

1  M-l 

In  Eq.  126,  the  nk  are  integers  and 
Tk  =  °k  ‘  Ak’ 


(126) 


where 


0  <  Ak  <  1. 


Thus,  the  h^  (n)  are  the  samples  of  h^ft+A^);  that  is, 


hAk<n)  =  hk(n+Ak). 


The  Fourier  transform  of  the  sequence  p  is 


M-l 


P(ejw)  =  1  +  £  «kHA  (ejw) 


-Jwn. 


(127) 


k=l 


where  (eJw)  is  the  Fourier  transform  of  the  samples  of  h(t+Ak),  and 

Ha  (ejw)  =  Hk(eju)  e^. 
k 

Hk(e"*w)  is  the  Fourier  transform  of  the  samples  of  hk(t). 

This  discrete  model  accounts  for  the  fact  that  each  echo  may  be  distorted  by 
its  transmission  path  and  also  for  the  small  shift  encountered  if  the  echo  delay 
time  is  not  an  integer  multiple  of  the  sampling  period. 

We  recall  that  the  analysis  for  multiple  echoes  was  quite  difficult,  and  compli¬ 
cated  expressions  were  obtained  for  the  complex  censtrum.  We  also  saw,  however, 
that  the  simple  case  of  a  single  echo  illustrates  most  of  the  important  concepts. 
Thus,  for  clarity,  let  us  consider  the  special  case 

P(ejw)  =  1  +  aHA(ejw)  e  ^ , 

where  HA(e^w)  is  the  transform  of  the  sequence  of  samples  of  a  continuous-time,  band- 
limited,  impulse  response  h(t+A).  If  we  assume  that 


max  aHA{e^w) 

-w  <  «  <  ff 


<  1. 


then  the  logarithm  of  P(e^w)  can  be  written 
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(128) 


i 

i 

i 

i 


! 


i  -jwn, 
l  +  aHA(eJw)e  1 


J“)  =  log  [] 

■  1  l-lk+1^[Hi(eh]k 

k=l 


-jwkn 


The  condition  on  |aHA(e^u>)  |  can  be  satisfied  for  all  minimum-phase  systems  and  many 
nonminimum -phase  systems  by  exponential  weighting  of  x(n),  since  pnx(n)  will  have  the 
transform 


X(P_1eJw)  =  S(p_1ejw)  P(p'1ejw), 


where 


P(p_Iejw)  =  1  +  ap^H^p^e*4*)  e 


Thus  we  must  choose  p  so  that 


max 
-ir  <u  <  n 


ap  lHA<p’1ei^ 


<  1. 


The  complex  cepstrum  for  (128)  is  given  by 

k 

(-1)*T1  -r 
k=  1 


P(n) 


■  l  (-l)k+1#[,k)h,n-k„1)]. 


where 


(k)h(n)  =  27^  [HA(ejw)Jk  ejwn  dw. 

We  also  note  from  (130)  that  ^hA(n)  satisfies 
(k)hA(n)  =  (k_1)hA(n)  ®  hA(n) 


(129) 


(130) 


where 

(0)hA(n)  c  6{n). 

We  see  that  if  A  =  0  and  hA(n)  =  6{n),  Eq.  129  reduces  to 
V  k 

P(n)  =  2  <~1)k+i‘t~  6<n-kn!). 
k=l 
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as  we  would  expect.  If  this  is  not  the  case,  then  the  impulses  in  the  complex  cepstrum 
are  dispersed  by  convolution  with  ^h.(n).  in  general,  if  the  systems  H.  (e^40)  are 


wideband, then  the  corresponding  impulse  response, h  (n),will  be  of  short  duration.  On 

i  k 

the  other  hand,  if  H .  (eJW)  is  narrow-band,  the  impulse  response  will  not  approach  zero 

k. 

very  rapidly.  In  the  first  case,  the  impulses  in  the  complex  cepstrum  will  be  convolved 
with  relatively  sharp  pulses,  while  in  the  latter  case  the  impulse  will  be  convolved  with 
sequences  that  are  rather  broad  and  dispersed. 

To  see  how  this  affects  our  results,  let  us  consider  two  examples. 


Example  4 

Consider  a  single  echo  path  for  which  h(n)  =  a11  n  5  o,  where  0  <  a  <  1.  It  can  be 
shown  that 


(n+k-1) ! 
nl(k-l)! 


n  5*  0. 


(k) 

Thus  we  see  that  '  'h(n)  becomes  increasingly  spread,  because  of  the  successive  con¬ 
volutions,  and  thus  the  peaks  at  large  values  of  knj  will  be  considerably  smeared 


R 


Qjh(n-nj) 


.i  oihh-M 
2  1 


2n. 


3n, 


4n. 


(b) 


Fig.  35.  Impulse  train  for  distorted  echoes,  (a)  Sequence  p(n)  =  6(n)  +  ajh(n~nj) 
for  h(n)  =  a11  n^O.  (b)  Complex  cepstrum  for  p(n). 
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out.  This  is  especially  true  if  a  is  close  to  1.  Figure  35  shows  a  plot  of  p(n)  in  (a)  and 
p(n)  in  (b)  for  this  example.  (The  samples  have  been  connected  by  straight  lines  for  ease 
in  plotting.) 

Example  5 

Let  us  consider  a  single  distorted  echo  obtained  by  filtering  with  a  system  whose 
system  function  is 

j60sq{u>) 

# 

where  sq(«)  is  given  by 

sq(«)  =  1  0  <  «  <  ir 


H(eiu) 


=  -1  -IT  <  (0  <  0, 


and  sq(to)  =  sq(w+r2ir)  r  =  0,  ±1,  ±2 .  Such  distortions  can  occur  in  acoustic-wave 

reflections.1 1  Thus  we  can  write 


jk0osq(w)  -jwknj 
e  e 


j0  ksq(w) 

It  can  be  shown  that  the  sequence  whose  transform  is  e  has  values 


-2  sin  0  k 
_ o 

irn 


n  *  0 


=  cos  0  k  n  =  0. 

o 

The  complex  cepstrum  p(n)  is  given  by 

P(n)  =  £  (-l)k+1-^  (k)h(n-knj). 
k=l 

(k) 

In  this  case,  the  shape  of  the  sequence  '  'h(n)  remains  the  same  for  all  values  of  k,  but 
we  note  that  the  relative  size  of  the  values  of  the  sequence  depends  on  the  phase  angle 
0Q.  In  this  case,  and  in  the  previous  example  wherein  a  is  close  to  1,  it  may  be  quite 
difficult  to  remove  the  echo,  since  it  will  require  a  comb  system  to  remove  the  compo¬ 
nents  ^h(n-knj)  without  significantly  disturbing  the  part  that  is  due  to  s(n). 

5.4  LINEAR  SYSTEMS  FOR  ECHO  REMOVAL  AND  DETECTION 

We  have  seen  that  a  simple  model  for  a  signal  containing  ec  .oes  is  the  convolution 
x  =  s  p,  where  p  is  an  impulse  train  in  which  the  samples  are  spaced  at  intervals 
greater  than  one.  In  section  5.  1 ,  we  presented  an  example  that  indicated  that  the 
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components  s  and  p  may  be  recovered  by  using  frequency -invariant  linear  systems  of 
the  form 

y(n)  =  £(n)  x(n). 

We  have  shown  that  the  essential  concepts  of  the  simple  example  are  true  in  general, 
even  thoug.i,  in  many  cases,  the  complex  cepstrum  becomes  quite  complicated.  We  now 
wish  to  clarify  our  definitions  and  terminology  and  discuss  some  of  the  details  of  using 
frequency- invariant  linear  systems  for  echo  removal  and  detection. 

We  recall  that  in  section  5.  1  we  introduced  the  terms  "  short  -pass,"  "long-pass,"  and 
"comb"  systems.  Precise  definitions  of  these  systems  will  now  be  given. 

An  "ideal  short-pass  system"  is  a  frequency -invariant  linear  system  for  which 

£(n)  =  1  -n_  <  n  <  n+ 

=  0  elsewhere, 

where  n_  and  n+  are  integers.  Such  a  system  is  shown  in  Fig.  36a. 

An  "ideal  long-pass  system"  is  defined  as 

£(n)  =  0  -n_  <  n  <  n+ 

=  i  elsewhere, 

where  n_  and  n+  are  integers.  This  system  is  shown  in  Fig.  36b. 

An  "ideal  comb  system"  is  defined  as 

An.  An. 

£(n)  =  0  nk — 2i<n<nk  +  -2i  k*l,2,  ... 

=  1  elsewhere. 

In  general,  nk  can  be  a  positive  or  negative  integer.  Such  a  system  is  shown  in  Fig.  36c. 
We  note  that  in  computation,  the  z -transform  is  replaced  by  the  FFT,  and  x  is  replaced 
by  the  periodic  sequence  x.  Thus  for  computation,  £(n)  is  effectively  periodic,  although 
we  only  work  with  one  period  of  the  sequence  x. 

If  N  [the  number  of  samples  of  X(eJW)]  is  fairly  large,  it  is  faster  to  compute  the 
complex  cepstrum  and  multiply  by  £{n)  than  to  do  the  equivalent  convolution  of  the  trans¬ 
form  of  £(n)  with  the  transform  of  x. 17  The  ideal  systems  of  Fig.  36  are  easily  realized 
in  the  computer  by  replacing  the  contents  of  appropriate  registers  with  zero.  It  is  well 
known  that  such  sharp  cutoff  systems  will  produce  ripples  in  the  transform  of  y;  there¬ 
fore,  in  many  cases,  it  may  be  preferable  to  use  approximations  to  these  ideal  systems 
which  have  smoother  transitions  between  one  and  zero. 

In  considering  the  advantages  of  homomorphic  deconvolution  over  linear  inverse  fil¬ 
tering,  the  basic  consideration  is  the  amount  of  information  about  the  signals  which  is 
required  to  design  the  system.  Clearly,  in  inverse  filtering  we  must  have  a  good 
approximation  to  the  signal  that  is  to  be  removed.  If  we  do  know  this  signal,  a  scheme 
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that  is  equivalent  to  inverse  filtering  is  subtracting  its  complex  cepstrum  from  that  of 
the  input.  In  this  case  we  could  recover  the  desired  signal  by  using  the  system  D  *.  One 
cannot  do  better  than  this.  In  choosing  to  remove  the  undesired  component  with  a  linear 


(b) 


,  /(") 

1 

An  | 

- 
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| 
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- 

"1 
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"3 

n 

(c) 

Fig.  36.  Ideal  frequency- invariant  linear  systems:  (a)  short-pass; 
(b)  long-pass;  (c)  comb. 


o 


system,  we  can  expect  to  obtain  only  an  approximation  to  the  desired  signal.  The  advan¬ 
tage  of  homomorphic  deconvolution  is  that  for  signals  that  are  convolutions  of  an  impulse 
train  with  a  sequence  having  its  samples  spaced  at  unit  intervals  the  complex  cepstra  of 
the  two  components  are  somewhat  separated  "in  time."  In  this  case,  only  partial  infor¬ 
mation  about  the  signals  is  required. 

From  the  examples  that  we  have  given,  we  can  see  that  it  is  possible  to  remove 
either  of  the  two  components  if  we  are  given  only  partial  information  about  the  impulse 
train.  For  example,  if  the  impulse  train  is  minimum-phase  (or  has  been  made  minimum- 
phase  by  exponential  weighting),  and  we  know  the  smallest  echo  delay,  we  can  recover 
the  impulse  train  by  using  a  long-pass  system  that  is  zero  for  all  n  less  than  the 
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smallest  echo  time.  We  note  that  the  impulses  will  be  convolved  with  the  sequence 
y  g  =  D~1[S-i]. 

The  values  of  this  sequence  will  normally  be  quite  small  if  the  smallest  echo  time  is  long 
enough  so  that  most  of  the  energy  of  s(n)  is  removed  by  the  long-pass  system. 

A  similar  method  is  suggested  for  recovering  the  signal  that  is  convolved  with  the 
impulse  train.  If  we  use  a  short-pass  system  that  is  equal  to  0  for  all  n  greater  than 
the  smallest  echo  time  and  equal  to  1  for  n  less  than  that  value,  we  shall  remove  the 
component  attributable  to  the  impulse  train  completely.  We  shall  also  remove  part  of 
the  complex  cepstrum  of  s.  In  most  practical  situations,  that  is,  when  the  echoes 


(d) 


Fig.  37.  Effect  of  a  short-pass  system,  (a)  Input  sequence. 

(b)  Output  for  short-pass  system  of  (d).  (c)  Com¬ 
plex  cepstrum  for  (a).  (Note:  all  traces  have  the 
same  time  origin  as  f(n).) 
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overlap  significantly  with  s,  this  method  generally  requires  that  we  remove  too  much 
of  §  to  obtain  a  very  good  approximation  to  s  at  the  output  of  the  inverse  system  D-1. 

A  simple  example  of  this  is  given  in  Fig.  32e,  where  a  short-pass  system  that  was  equal 
to  1  only  up  to  n  =  16  was  used  to  recover  3.  Another  example  is  given  in  Fig.  37.  In 
Fig.  37a  we  show  a  segment  of  a  speech  waveform  (without  echoes)  which  has  l  een 
weighted  with  a  Hamming  window.  In  Fig.  37c  we  show  the  complex  cepstrum  of  this 
segment,  and  in  Fig.  37d  we  show  a  short-pass  system.  The  output  of  system  D  1  for 
this  case  is  shown  in  Fig.  37b.  (The  origin  of  all  traces  is  in  line  with  that  of  j£(n).)  By 
comparing  Fig.  37a  and  37b,  we  observe  that  the  two  are  almost  identical  for  values  of 
n  <  50.  (The  time  axis  in  all  plots  coincides  with  d.)  For  n  >  50  we  observe  that 
Fig.  37b  differs  significantly  from  the  original  waveform.  That  this  is  true  in  general 
for  short-pass  systems  can  be  shown  from  our  discussion  in  section  2.  7. 

Recall  that  for  finite-length  sequences  with  no  linear  phase  component 

m.  m 

^  /  — l  \  ® 

X(z)  =  A  II  (l-a^z  J  II  (1-b^z). 
k—  1  1 

Thus  x(n)  is  zero  outside  the  interval  -mQ  <  n  <§  m..  Recall  that  x  and  y  may  be  written 

X  =  X  .  ®  x 

mm  max 


y  =  ^min  ®  '’max' 


where 


*min<n>  =  *<n) 

n  £  0 

=  0 

n  <  0, 

*max<n>  = 

n  <  0 

=0 

n  £0. 

We  also  see  that  for  short-pass  systems,  as 

>'min<n)  ■  xmtn(n) 

0  <  n  «  n 

c 

«  0 

n  <  0 

*  1W“> 

n  <  n 
c 

and 

W"i  *  W 

-m  ^  n  «  0 
o 

=  0 

elsewhere. 
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It  is  relatively  easy  to  see  from  these  results  that 


=  0 

n  <  m 

0 

y(n)  =  x(n) 

-m  <  n  <  n  -  m 
o  CO 

*  x(n) 

n  -  m  <  n. 
c  o 

In  the  example  of  Fig.  ,  mQ  =  259  and  m.  =  253.  Since  nfi  =  96,  we  see  that  the 
result  above  predicts  that  y(n)  will  be  different  from  x(n)  for  all  n  >  -163.  This  is  not 
detectable  visually,  until  approximately  n=  0.  This  is  so  because  for  n>nc-mQ,  ymax(n) 
is  relatively  small,  and  it  is  only  for  larger  values  of  n  that  this  error  in  ymax  is 
reflected  in  y(n). 

In  general,  the  short-pass  system  is  not  as  useful  in  echo  removal  as  we  might  hope, 
since  in  cases  wherein  the  echoes  significantly  overlap  we  are  required  to  remove  too 
much  of  the  complex  cepstrum  of  the  desired  output. 

The  alternative  to  the  short-pass  system  is  the  comb  system.  In  this  case,  we  need 
much  more  information  about  the  impulse  train  in  order  to  choose  the  values  of  n^  and 
An^.  We  do  not  need  to  know  the  sizes  of  the  echoes  but  either  we  must  know  their  loca¬ 
tions  or  the  locations  of  the  impulses  in  the  complex  cepstrum.  If  the  echoes  have  been 
distorted  by  a  linear  system  (as  discussed  in  section  5.  3) ,  we  also  need  to  know  the 
approximate  duration  of  the  impulse  response  in  order  to  choose  the  A^.  In  general,  we 
do  not  have  this  much  information  about  the  impulse  train.  Since  th ;  significant  peaks 
that  are  due  to  the  impulse  train  stand  out,  however,  in  the  complex  cepstrum,  except 
for  very  short  echo  delays,  we  may  detect  these  peaks  and  set  the  parameters  of 
the  comb  system  appropriately.  This  can  be  done  in  an  jn-line  computation  system  if 
a  display  screen,  or  other  graphical  output  device,  >a  available,  and  if  the  experi¬ 
menter  is  capable  of  interacting  with  the  program.  This  was  done  in  much  of  the  experi¬ 
mental  work  which  is  reported  here.  It  is  also  clearly  possible  to  program  relatively 
simple  algorithms  for  searching  appropriate  regions  of  the  complex  cepstrum  for 
peaks  attributable  to  the  impulse  train.  The  information  obtained  from  such  algo¬ 
rithms  can  then  be  used  to  set  the  parameters  of  the  comb  system. 

In  connection  with  such  algorithms,  it  is  worth  pointing  out  that  we  have  found  that 
the  phase  component  of  the  sequence  s  normally  contributes  a  larger  compo.  at  to 
the  complex  cepstrum  than  the  log  magnitude.  Thus,  in  order  to  make  the  detection  of 
impulses  in  the  complex  cepstrum  easier,  it  is  desirable  to  use  only  the  even  part  of 
the  complex  cepstrum.  If  the  impulse  train  is  minimum-phase,  the  impulses  in  the  ven 
part  will  occur  at  the  same  locations  as  those  in  the  complete  complex  cepstrum.  Thus 
detection  of  the  impulses  in  the  even  part  of  the  complex  cepstrum  for  n  >  0  is  equivalent 
to  detection  of  the  impulses  in  the  complex  cepstrum  for  minimum-phase  sequences. 

We  have  discussed  the  basic  forms  of  frequency-invariant  systems  that  can  be  used 
in  echo  detection  and  echo  removal.  Although  the  short-pass  system  seems  to  distort 
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the  desired  output  too  much,  it  is  possible  to  use  a  sort  of  self-adjusting  comb  system 
for  echo  removal.  We  shall  discuss  examples  of  echo  removal  and  echo  detection,  using 
the  ideas  just  formulated. 

5.  5  SHORT-TIME  ECHO  REMOVAL 

We  have  discussed  the  properties  of  the  complex  cepstrum  of  sequences  of  the  form 
x  =  s  ®  p, 

where  s  is  a  sequence  with  values  spaced  at  unit  intervals,  and  p  is  an  impulse  train. 
We  also  showed  how  frequency-invariant  linear  systems  can  be  used  to  recover  either 
s  or  p.  Throughout  all  of  the  previous  discussion,  we  have  assumed  that  the  sequence  x 
was  of  finite  length  and  that  we  were  able  to  compute  the  z-transform  (or  FFT)  of  the 
entire  sequence.  In  some  applications,  for  example,  removal  of  echoes  from  speech 
signals,  the  duration  of  the  signal  and  high  sampling  rate  combine  to  give  sequences 
with  a  great  many  samples.  To  process  such  sequences  all  at  once,  we  are  required 
to  take  the  FFT  of  a  long  sequence.  In  turn,  for  efficient  operation,  the  FFT  would  need 
a  large  amount  of  high-speed  memory  (that  is,  core  storage).  Thus,  we  are  led  to 
inquire  into  the  possibility  of  processing  such  sequences  in  shorter  segments,  and  then 
somehow  putting  these  segments  back  together  to  form  the  complete  output  sequence. 
This  can  indeed  be  done  and  we  shall  give  an  analysis  of  this  procedure. 

We  shall  begin  with  some  definitions.  We  define  the  short-time  z-transform  of  the 
sequence  x  to  bo 

L-l 

X(|,z)=  Y  x(£,n)  z“‘\  (131) 

L-j 

n=0 

where 

x(£,  n)  =  x(^n). 

The  short-time  z-transform  with  window  w  is  defined  to  be 
L-l 

M.-  I  **  ,n)w(n)z“n.  (132) 

n=0 

Clearly,  Xw(£,z)  is  just  the  z-transform  of  a  finite -length,  weighted  segment  of  the 
sequence  x,  and  the  parameter  £  simply  serves  to  specify  which  segment  is  under  con¬ 
sideration.  Therefore,  if  w(n)  #  0  for  0  <  n  <  L,  then 

x(|+n)  = - - —  (£  X  (|,z)  zn~l  dz,  (133) 

2irjw(n)  JC 

where  the  contour  C  may  be  the  unit  circle. 


In  particular ,  if  w  is  an  exponential  window 


w(n)  =  a  0  <  n  <  L, 


we  define 


i-i— i 

X($,z,a)=  £  x(|+n)aVn. 


We  see  that 


XU.z.a)  =  XU,z/a) 


XU.z.l)  =  X(|,z). 


With  these  definitions  in  mind,  let  us  assume  that  the  sequence  x  has  values 


m-i 

x(n)  =  s(n)  +  ^  aj^sjn-i^) , 


where  the  sequence  s  is  a  sequence  with  unit  spacing  of  samples  (such  as  a  speech 
waveform).  The  short-time  z-transform  with  exponential  window  is 

L-l  M-l  L-l 

XU.z.a)  =  £  sU+n)anz"n+  ^  ak  ^  sU+n-ry  anz"n.  (135] 


k=l  n=0 


If  we  let  q  =  n  -  nk,  we  can  write 


L-l  -r 


^  sU+n-r^) 


n-n  \  -nk 
a  z  =  a^a  z 


^  s(£+q)  a^z-^. 


Through  some  simple  manipulations,  (136)  may  be  written 


Ly-l 

£  sU+n-i^)  anz~n  =  akS(4,  z,  a)  a\  ^  +  Ek(£,  z,  a). 


where 


i-/  * 

SU.z.a)-  ^  sU+n)  anz“n 
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and 


Ek(£,z,a)  =  flj^z  ^  ^  s(4+n)  anz‘n 

n=-\ 

,V\l-l  V  ..... 


-cka  z  a  z 


^  s(4+L+n)  a" 

n=-\ 


Thus  Eq.  135  may  be  written 


X(4,z,al  =  S(4, z, a)  P(z/a)  +  E(4,z,a), 


where 


\  ^i,  “O  * 

P(z/a)  =  1  +  )  a,  a  kz  k 


E(4.z,a)=  ^  ER(4,z,a). 


Let  us  pause  and  interpret  (139).  Suppose  that  s  is  finite  length,  and  s(n)  =  o  for 
n  <  0.  If  4  =  0,  and  if  L  is  greater  than  the  total  length  of  the  sequence  x,  the  terms 
Ek(£.  z,  a)  will  all  vanish  in  (139).  Under  these  conditions,  we  are  transforming  all  of 
the  sequence  at  once,  and  we  should  expect  that 

X(0,z,a)  =  X(z/a)  =  S(z/a)  P(z/a),  (140) 

where  X(z)  is  the  z-transform  of  x.  Thus,  the  term  E(4,z,a)  is  appropriately  termed 
"the  error  in  X(4,z,a)."  That  is,  it  is  the  amount  by  which  Eq.  139  fails  to  have  the 
form  of  the  right-hand  side  of  Eq.  140.  The  reason  for  these  errors  can  be  seen  from 
Eq.  138.  Let  us  write  (138)  as 

Ek(£.z,a)  =  Fk(4,z,a)  -a^z"LFk(4+L,z,a),  (141) 

where 


Fk(4,z,a)  =-•  <zka  kz  k  £  s(4+n)  anz"n. 

n=_nk 
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The  first  term  on  the  right  in  (141)  is  seen  to  be  due  to  overlap  of  the  k  echo  from  the 
previous  segment  into  the  present  segment.  That  is,  the  samples  s(|+n-n^)  0  «  n  <  n^ 
do  not  appear  in  the  basic  segment  s(|+n)  0  n  <  L.  The  second  term  on  the  right  in 
(141)  is  due  to  samples  that  appear  in  the  segment  sd+n)  0  n  <  L  and  do  not  appear  in 
the  segment  of  the  echo  sd+n-n^)  0  n  <  L,  We  shall  refer  to  F^d,  z,a)  as  the  error 
at  the  beginning  of  the  segment,  and  -a^z'^F^^+L,  z,a)  as  the  error  at  the  end  of  the 
segment.  We  note  that,  except  for  a  delay  and  multiplication  by  a  constant,  the  error 

at  the  end  of  the  segment  corresponding  to  £  =  CL  is  the  negative  of  the  error  at  the 

o 

beginning  of  the  segment  corresponding  to  %  =  iQ  +  L. 

Now  let  us  consider  the  logarithm  of  X(£,  z,a).  We  define 

£d,z,a)  =  log  [Xd,  z,  a)]  =  log  [Sd,z,a)P(z/a)+Ed,z,a)],  (143) 

and  the  short-time  complex  cepstrum  as 


xd.n.a) 


i-  p  Xd,ejw,a)  ejwndu>. 
J-n 


(144) 


We  can  write  Eq.  143 


Xd,  z.a)  =  log 


Ed.z.a) 

Sd.z.a)  + - 

P(z/a) 

-  J 


+  log  fP(z/a)]. 


(145) 


From  the  second  term  on  the  right  in  (145),  we  see  that  the  short-time  complex  cep¬ 
strum  can  be  thought  of  as  containing  the  same  impulse-train  component  as  me  complex 
cepstrum  of  the  entire  sequence.  This  assumption  implies  certain  restrictions  on  the 
length  of  the  window  and  the  nature  of  the  signal  s.  These  restrictions  will  be  discussed 
below.  If  this  component  were  removed  by  a  comb  system,  the  short-time  transform 
of  the  output  segment  would  be  simply 

Ed,  z,  a) 

Y(l. z.a)  =  Sd.z.a)  +  - - .  (146) 

P(z/a) 


The  samples  of  this  output  segment  are 

y(4+n)  an  =  ^  Y(|,  z, a)  ejwn  du  0  «  n  <  L.  (147) 

The  effect  of  the  exponential  weighting  may  be  removed  by  multiplying  by  a-n,  to  obtain 
the  sequence  of  samples  which  has  the  transform 

Ed.z.l) 

Yd,z,l)  =  Sd.z.l)  + - .  (148) 

P(z) 

From  (148)  it  is  clear  that  in  the  interval  0  ^  n  <  L,  the  output  segment  is  the  sum 
of  the  desired  output  sd+n)  and  a  sequence  related  to  the  error  in  Xd,z,l).  To  see 
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the  nature  of  this  term,  it  is  helpful  to  consider  some  special  caces. 

Example  6 

Assume  that  M  =  2;  that  is,  there  is  only  a  single  echo.  Then  from  Eq.  141  we  have 

E, 

Y<£,z,l)  =-  S<£,z,l)  +— i . 

1  +  1 

Thus  the  output  segment  has  values 

OO 

y(£+n)  =  s(£+n)  +  ^  (~aj)r  e^+n-mj), 
r=0 


for  n?0.  This  example  is  illustrated  in  Fig.  ?8. 


Example  7 


Assume  that  «k 


and  =  kn^. 


Therefore,  from  Eq.  148, 


Y{4,z,l)  =  S(£,z,l)  + 


M-l 

2  E.(4,z,i) 

k=0  K 


M-l 

E 


k  ~knl 
a  z 


k=0 


which  can  be  written 


/  -n\ 

Y(e.z.l)  «S(|,Z,1)  +  fl-  az  M  £  Ek(4,z,l) 

'  7  k=l 


M 

if  a  is  small.  Thus  the  output  segment  has  values 

M-l 

y(6+n)  «  s(£+n)  +  ^  [ek(4+n)  -«ek(£+n-knj)]. 

k=l 

Using  Eq.  141,  we  can  write 
M-l  M-l 


M-l 

y  Ek(5,z,i)=  y  Fk{5,z,D -z"l  y  Fk(e+L,z,D, 

ic™  1  k-  j  k*"  j 


and  using  Eq.  142  and  some  manipulation,  we  obtain 

Fk(£,z,l)  =  az  1  ^  s(£+n)  z  n  +  a2z  1  ^  s(£+n)  z  n 


(--■)  I 

k=l 


-nrl 


n=~n 


M-l  -(M-1>nl 
+  ...  +  a  z 


n=--2n 


1 


-(M-2)nrl 

2  s(|+n)  z“n 

n=-(M-l)nj 


M„ 
a  z 


-Mn.  ^ 

2  s(4+n)  z~n. 


(149) 


n=-Mn 


1 


Careful  examination  of  the  terms  in  (149)  shows  that  all  of  the  terms  on  the  right  except 
the  last  contribute  to  the  error  of  y(|+n)  only  in  the  interval  0  <  n  <  Oj ,  while  the  last 
term  is  quite  small  but  does  contribute  over  the  interval  0  n  <  Mn^,  An  identical 
result  holds  for  the  error  at  the  end  of  the  segment  so  that  the  error  term 
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I 


m. 


M-l 

^  [ek(4+n)-aek(4+n-knI)] 

k=l 

will  be  approximately  zero  everywhere  but  in  the  intervals  0  <  n  <nj,  and  L  <  ncL+nj. 
An  example  of  this  is  shown  in  Fig.  42f  where  the  first  two  traces  correspond  to  0 < n<L, 
and  the  last  two  traces  correspond  to  L  ^  n  <  2L. 

Example  8 

Let  there  be  two  echoes  not  necessarily  equally  spaced.  Then,  from  Eq.  148, 

E,(£,z,l)  +  E_(£,  z,  1) 

Y(|,  z,  1)  =  S(£,z,l)  +-* - - — -  ♦ 

1  n2 
1  +  «Lz  +  o2z 

We  could  proceed  as  before.  In  this  case,  however,  the  equations  become  so  complex  as 
to  be  relatively  useless.  We  can  see  from  Eq.  138  that  the  sequence  corresponding  to 

E^e.z.l)  +  E2(£,z.l) 

will  be  nonzero  only  in  the  intervals  0  <  n  <  n,  and  L  <  n  <  L  +  n_.  This  sequence  is 

(  -n  -"zV1 

then  convolved  with  the  sequence  corresponding  to  I  1  +  az  +  a2z  J  which  will  be 
an  impulse  train.  '  ' 

The  three  previous  examples  have  several  things  in  common.  We  note  that  the  term 

M-l 

E(£,  z,  1)  =  £  Ek(£,z,l) 

k=l 

is  the  amount  by  which  X{£,  z,  1)  fails  to  be  a  product  of  S{£,z,l)  and  P(z)  the  trans¬ 
form  of  the  impulse  train.  We  have  called  this  the  error  in  X(£,  z,  1).  Similarly,  we 
note  that 

E(£,z,l) 

P<z) 

is  the  amount  by  which  Y{£,z,l)  fails  to  be  the  desired  output  S(£,z,l).  We  shall  refer 
to  this  as  the  error  in  Y(£,z,l).  Note  that  we  use  the  term  "error"  in  a  slightly  dif¬ 
ferent  sense  in  this  case.  The  error  in  the  output  c«.'n  be  thought  of  as  the  error  in  the 
input,  passed  through  the  inverse  system  for  the  Impulse  train  that  represents  the 
echoes.  We  have  seen  that  the  error  in  the  output  consists  of  two  segments,  one  of  which 
is  primarily  in  the  interval  0  «  n  <  L,  and  a  second  segment  which  is  in  the  region  L«S  n. 
We  have  also  seen  that  in  general  these  errors  tend  to  become  small  for  large  n.  Fur¬ 
thermore,  the  error  in  the  interval  L  «£  n  for  the  segment  £  =  £q  is  the  negative  of  the 
error  in  the  interval  0  n  <  L  for  the  segment  £  =  £q  +  L.  This  fact  will  be  used  when 
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we  discuss  putting  the  output  segments  back  together. 

In  obtaining  the  previous  results  it  was  assumed  that  the  short-time  complex  cep- 
strum  can  be  thought  of  as  containing  a  component  that  is  equal  to  the  complex  cepstrum 
of  the  impulse  train  that  produces  the  echoes.  In  order  for  this  to  be  true,  the  length 
of  the  segment  L  must  be  large  compared  with  n^_j ,  the  longest  echo  time.  Further¬ 
more,  the  signal  s  must  change  character  over  the  segment  of  interest.  These  points 
are  clear  intuitively,  but  quantitative  results  appear  to  be  difficult  to  formulate.  Clearly, 
we  want  the  error  segments  in  the  input  to  be  short  compared  with  the  total  length  L. 
Since  the  length  of  these  error  segments  is  equal  to  the  longest  echo  delay,  we  require 

L  »nM-T 

This  means  that  the  corresponding  errors  in  the  output  will  be  concentrated  primarily 
in  the  two  intervals  0  <  n  <  L  and  L  <  n  <  2L. 

With  respect  to  the  character  of  the  signal  s,  let  us  consider  an  example.  Suppose 
s  is  a  sine  wave  so  that 

s(n)  =  sin  (n)  n?0 

=  0  n  <  0. 

If  the  sequence  x  has  values 
x(n)  =  s(n)  +  as(n-n1), 

it  is  clear  that  all  segments  of  x  for  which  i  >  will  look  just  like  a  sine  wave  with 
some  phase  shift.  That  is, 

x(n)  =  sin  n  +  a  sin  (n-n^) 

=  (1  +  a  cosnj)  sin  n  -  sin  nj  cos  n  n  ^  ^ 

=  A  sin  (n+0). 

All  periodic  sequences  suffer  the  same  difficulty,  as  well  as  exponential  sequences  of 
the  form  an  n  ^  0.  In  all  of  these  cases,  the  short-time  complex  cepstrum  will  not 
exhibit  an  impulse  train  because  of  the  echo.  Speech  signals,  for  example,  change 
character  as  time  progresses.  Clearly,  the  requirements  on  the  character  of  the  wave¬ 
form  and  length  of  the  segment  L  are  interrelated.  If  we  make  the  segment  long  enough, 
almost  any  nonperiodic  signal  will  change  sufficiently  across  the  segment. 

The  previous  results  were  based  on  the  ^-transform,  whereas  we  shall  realize  such 
processing  by  using  the  FFT.  This,  of  course,  means  that  the  short-time  complex  cep¬ 
strum  that  we  compute  will  be  aliased,  and  will  have  values  given  by 

»  N-l  2* 

x(|,n,a)=  Y  x(£,n+rN,a)  =-^-  Y  log  [X(|,k,a)]  e  N 

r=-»  k=0 
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where  N  >  L,  and 

N-l  j  2ir  ^ 

X(i,k,a)  =  £  x(£,n)  an  e  N  . 
n=0 

The  resulting  output  of  the  system  after  removing  the  impulse  train  caused  by  the 
echoes  is  approximately  the  exponentially  weighted  output 

00 

1  a^Vtn+rN). 

r-— oo 

If  we  unweight  with  a~n  in  the  interval  0  ^  n  <  N,  we  obtain  approximately 

y(i,  n)  «  y($,  n)  +  aNy(4,  n+N)  (150) 

for  0  ^  n  <  N.  All  indices  are  taken  modulo  N. 

From  (150)  we  see  that  if  N  =  L,  the  error  in  y(£,n)  is  primarily  at  the  beginning  of 

the  segment,  since  it  is  composed  primarily  of  the  error  in  the  beginning  of  y(£,n)  plus 

N 

the  error  at  the  end  of  a  y(£,  n+N).  If  we  choose  N  =  2L,  the  error  in  y(£,n)  is  approxi- 

<v 

mately  the  error  in  y(£,n).  Thus,  in  this  case,  we  can  apply  most  of  the  prer  nt  results 
even  though  aliasing  does  occur. 


0  L  2L  31.  4L  5L  6L  71 


Fig.  39.  Correction  method  for  short-time  echo  removal. 


Having  discussed  the  actual  computation  of  the  short-time  complex  cepstrum  and 
resulting  output  segment,  we  are  able  to  indicate  how  the  output  segments  may  be  put 
together  to  form  the  total  output  sequence.  V/e  have  noted  that  if  the  errors  in  the 
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output  tend  to  approach  zero,  the  error  in  the  interval  L  <  n  <  2L  for  the  segment  4=  4Q 
is  approximately  the  negative  of  the  error  in  the  interval  0  <  n  <  L  for  the  segment  4  = 

4o  +  L.  This  suggests  that  we  can  move  the  window  along  in  jumps  of  L  samples.  Each 
time  we  save  the  output  for  L  ^  N  <  2L  from  the  segment  4  =  40  so  that  it  may  be  added 
to  the  values  in  the  interval  0  S  n  <  L  of  the  segment  4  =  40  +  L.  This  method  is  illus¬ 
trated  in  Fig.  39.  We  refer  to  this  method  as  the  "correction  method." 

As  an  alternative  to  the  above  scheme,  we  note  that  if  the  segment  is  quite  long  com¬ 
pared  to  the  longest  echo  time,  the  error  is  generally  small  over  some  interval 
L>0  <  n  <  L  so  that  we  may  retain  only  this  part  which  is  relatively  free  of  error.  This 


Fig.  40.  Overlap  method  of  short-time  echo  removal. 


scheme  is  shown  in  Fig.  40,  and  it  is  referred  to  as  the  "overlap  method."  Examples 
indicating  the  feasibility  of  these  schemes  will  be  given. 


5.  6  REMOVAL  OF  ECHOES  FROM  SPEECH  SIGNALS 


We  shall  present  some  examples  of  the  application  of  the  previous  results.  We  shall 
show  several  examples  of  the  removal  of  computer-simulated  echoes  from  speech  sig¬ 
nals.  In  all  figures,  each  trace  in  a  given  picture  represents  1024  sa  mples  unless  other¬ 
wise  noted.  The  consecutive  traces  represent  consecutive  1024  sample  segments  of  the 
speech  waveform.  The  entire  waveform  corresponds  to  the  sentence,  "A  pot  of  tea  helps 
to  pass  the  evening."  The  sampling  rate  was  10  kHz.  In  all  examples,  we  used  an  expo¬ 
nential  window  with  a  =  0.  9987  and  L  =  2048.  The  value  of  N  for  the  FFT  was  N  =  4096. 
The  segment  of  speech  was,  therefore,  augmented  with  2048  zeros  before  transfor¬ 
mation. 
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(e) 

Fig.  41.  Short-time  echo  removal  for  an  impulse  train 
p(n)  =  6(n)  +  3/4  6(n-500). 

(a)  Signal  s(n).  (First  4096  samples.) 

(b)  Signal  x(n)  =  s(n)  ®  p(n). 

(c)  Complex  cepstrum  for  the  first  2048  samples  of  x. 

(d)  Output  for  the  first  2048  samples. 

(e)  Error  for  the  second  2048  samples. 


Ill 


r 

2 


5 
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Example  9 


The  speech  signal  was  convolved  with 
p(n)  =  6(n)  +-—•  6(n-500). 

The  first  4096  samples  of  the  signal  s  and  the  sequence  x  are  shown  in  Fig.  41a  ar.d 
41b,  The  complex  cepstrum  for  the  segment  corresponding  to  %  =  0  is  shown  in 
Fig.  41c.  We  note  the  impulse  appearing  at  n  ^  500.  The  result  of  removing  the 
impulses  at  n  =  500,  1000,  etc. ,  using  a  comb  system,  and  then  transforming  the 
result  with  the  system  D-1  is  shown  in  Fig.  4 Id.  The  waveform  in  the  interval 
0  ^  n  <  2048  is  indistinguishable  by  eye  from  the  first  two  traces  of  Fig.  41  e.  We  also 
note  that  the  error  at  the  end  of  the  segment  appears  repeated  as  predicted  by 
Example  6.  There  is  no  error  at  the  beginning  of  the  segment,  smce  the  speech  signal 
wao  essentially  zero  for  n  <  0.  The  segment  corresponding  to  £  =  2048,  that  is,  the 
second  two  traces  in  Fig.  41b,  was  processed  in  the  same  way,  and  the  difference 
between  the  resulting  output  and  the  original  segment  augmented  with  zeros  (the  error 
in  the  output)  is  shown  m  Fig.  41  e.  We  note  that  this  is  just  the  error  in  the  output,  and 
we  see  clearly  that  the  first  two  traces  in  Fig.  41  e  are  approximately  the  negative  of 
the  last  two  traces  in  Fig.  4 Id.  If  the  last  two  traces  of  Fig.  41d  were  added  to  the  first 
two  traces  in  Fig.  41e  (that  is,  the  segment  for  £  =  2048),  the  error  in  the  interval 
0  «  n  <  L  would  be  essentially  eliminated.  Similarly,  we  could  save  the  last  two  traces 
of  Fig.  41e,  to  use  as  a  correction  for  the  next  segment. 

We  also  can  see  from  Fig.  41e  that  the  error  is  relatively  small  in  the  interval 
1024  «  n  <  2048.  This  suggests  that  we  could  also  use  L  =  N  =  4096,  and  disregard  the 
first  1024  samples  in  each  output  segment,  as  suggested  by  Fig.  40.  In  this  case,  we 
would  obtain  3072  samples  per  segment,  rather  than  2048  as  in  the  correction  method. 

Example  10 

In  this  case,  the  echoes  were  specified  by 


M-l  n 

P(n)  =  y  ( j )  6(n-k500). 
k=0 


The  first  4096  samples  of  the  sequences  s  and  x  are  shown  in  Fig.  42a  and  42b.  The 
short-time  complex  cepstrum  obtained  from  the  first  two  traces  cf  i  g.  42b  augmented 
with  2048  zeros  is  shown  in  Fig.  42c.  The  impulses  at  n  =  500,  1000,  etc.  ,  were 
removed  with  a  comb  system,  and  the  resulting  output  is  shown  in  Fig.  42d.  The  second 
two  traces  in  Fig.  42b  were  processed  similarly,  and  the  difference  between  this  output 
and  the  la?'.  2  traces  of  Fig.  42a  augmented  with  2048  zeros  (the  desired  output)  is  shown 
in  Fig.  42f.  We  note  that  the  error  is  concentrated  in  the  interval  0  «  n  <  500,  as  pre¬ 
dicted  by  Example  7.  We  also  see  that  the  error  in  the  interval  0  «  n  <  2048  in  Fig.  42f 
is  the  negative  of  the  error  in  the  interval  2048  <  n  <  4096  in  Fig.  42d. 
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(e)  (f) 

Fig.  42.  Short-time  echo  removal  for  an  impulse  train 
M-l  . 

p(n)  =  £  (3/f)  6(n-k500). 

k=0 

(a)  Signal  3{n).  (First  4096  samples.) 

(b)  Signal  x(n)  =  s(n)  g  p(n). 

(c)  Complex  cepstrum  for  the  first  2048  samples  of  x. 

(d)  Output  for  *he  first  2048  samples. 

(e)  Output  for  th >'  second  2048  samples. 

(f)  Error  for  th<  second  2048  samples. 
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Example  11 


As  a  final  example,  the  speech  signal  was  convolved  with 
p(n)  =  6(n)  +  |  6(n-320)  +~  6(n-500). 

Again,  the  first  2048  samples  of  the  resulting  sequence  were  augmented  with  2048  zeros 
and  processed  as  before.  The  error  for  the  first  output  segment  is  shown  in  Fig.  43a. 
The  error  for  the  second  segment  (£  =  2048)  is  shown  in  Fig.  43b.  In  Fig.  43c  we  show 
the  sum  of  the  last  two  traces  of  Fig.  43a  and  the  first  two  traces  of  Fig.  43b.  We  note 
that  the  errors  clearly  tend  to  cancel.  Significant  error  would  remain  in  the  second 
segment,  however.  This  error  is  primarily  due  to  the  fact  that  the  window  is  not  long 
enough  relative  to  the  echo  time.  (To  see  the  size  of  the  error  relative  to  the  desired 
output  see  Fig.  41a.) 

All  of  the  previous  examples  have  indicated  that  it  is  possible  to  put  the  output  seg¬ 
ments  back  together  in  a  meaningful  way  either  by  using  the  correction  method  or  the 
overlap  method.  This  was  done,  in  fact,  for  several  different  variations  of  echo  times, 
and  the  resulting  output  speech  was  converted  to  analog  form  for  listening.  Informal 
listening  tests  showed  that  if  a  suitable  value  of  L  is  chosen,  echoes  can  be  removed 
from  speech  signals  by  using  these  techniques.  The  processed  speech  was  slightly  more 
noisy  than  the  input  speech.  This  noise  in  the  output  is  attributable  to  the  fact  that 
there  are  small  errors  remaining  in  each  segment,  as  in  Fig.  43c. 

5.  7  EFFECT  OF  ADDITIVE  NOISE 

The  examples  that  we  have  shown  were  carried  out  on  signals  with  a  high  signal- 
to-noi3e  ratio.  Suppose  that  we  have  a  sequence  x  which  is  of  the  form 

x  =  s  ®  p  r  g, 

where  s  is  the  desired  signal,  p  is  an  impulse  train,  and  g  is  an  additive  noise 
sequence.  The  short-time  transform  of  a  segment  of  the  sequence  x  is 

X(£.z)  =  S(£,z)  P(z)  +  E<£,z)  +  G(£,z), 

where  S(£,z)  is  the  short-time  transform  of  the  signal,  P(*)  is  a  polynomial  in  z 
tl^t  is  the  transform  of  the  impulse  train,  E(£,z)  is  the  error  as  defined  in  sec¬ 
tion  5.5,  and  G(£,z)  is  the  short-time  transform  of  the  noise.  If  the  noise  level  is 
low,  we  may  again  assume  that  the  short-time  complex  cepstrum  has  a  component 
that  is  due  to  P(z)  which  can  be  removed  by  a  comb  system.  The  output  sequence 
will  then  be  of  the  form 

E(£.z)  +  G(£, z) 

Y(|,z)  =  S(£,z)  + - .  (151) 

P(z) 

Thus,  in  addition  to  the  errors  previously  discussed,  we  see  that  the  noise  in  the 
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Fig.  43.  Short-time  echo  removal  for  the  impulse  train 
p(n)  =  5{n)  +  (3/4)  6(n-320>  +  (l/2)  6(n-500). 

(a)  Error  in  the  output  for  the  first  2048  samples  of  x(n)  = 
s(n)  ®  p(n).  (s(n)  is  the  same  as  in  Fig.  41.) 

(b)  Error  in  the  output  for  the  second  2048  samples  of  x. 

(c)  Sum  of  the  second  two  traces  of  (a)  and  the  first  two 
traces  of  (b). 


Fig.  44.  Short-time  echo  detection 
M-l 

p(n)  =  ^  (3/4)k  6(n-k832). 
k=0 

(a)  Complex  cepstrum  of  the  second  2048  samples  of  x(n)  - 
s(n)  ®  p(n).  (s(n)  is  as  in  Fig.  41a.) 

(b)  Output  for  a  long-pass  system. 
p(n)  =  6(n)  +  h(n-832) 

h{n)  =  (3/4)n/4  n?0 

=  0  n  <  0 

(c)  Complex  cepstrum  for  the  second  2048  samples  of  x. 

(d)  Output  for  a  long-pass  system. 

p(n)  =  6(n)  +  (3/4)  6(n-832)  +  (1/2)  6(n-1536). 

(e)  Complex  cepstrum  for  the  second  2048  samples  of  x. 

(f)  Output  for  a  long-pass  system. 
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input  segment  also  appears  in  the  output.  We  note  that  the  noise  is  also  effectively 
filtered  by  the  linear  system  corresponding  to  l/P(z). 

To  test  the  effectiveness  of  short-time  echo  removal  at  low  signal-to-noise  ratios, 
we  added  a  white  Gaussian  noise  sequence  to  the  sequence  s®  p,  where  p  was  an 
impulse  train  with  equal  spacing.  Signal-to-noise  ratios  as  low  as  1 5  dB  were  used.  The 
sequences  were  processed  as  before,  and  the  resulting  outputs  were  converted  to  analog 
form.  Informal  listening  tests  for  the  speech  signals  showed  that  in  the  examples,  at 
least,  the  echoes  were  removed  and  the  noise  level  of  the  output  was  about  the  same  as 
that  of  the  input.  We  should  point  out,  however,  that  only  limited  experimentation  was 
done,  and  a  clear  understanding  of  the  effect  of  additive  noise  is  yet  to  be  obtained. 

5. 8  DETECTION  OF  ECHOES 

We  have  considered  a  signal  containing  echoes  to  be  represented  by  a  convolution  of 
a  basic  signal  and  an  impulse  train.  We  have  also  seen  that  the  complex  cepstrum  of  an 
impulse  train  is  itself  an  impulse  train.  We  have  shown  how  echoes  may  be  removed 
by  using  a  linear  frequency-invariant  system  (possibly  on  a  short-time  basis).  In  con¬ 
clusion,  let  us  consider  the  problem  of  detection  of  echoes,  that  is,  recovery  of  the 
impulse  train  p(n). 

As  might  be  expected,  the  problem  of  echo  detection  is  not  as  difficult  as  the  problem 
of  extraction  of  the  signal.  We  have  seen  that  in  simple  cases,  the  complex  cepstrum, 
cr  Hs  even  part,  may  be  used  for  echo  detection.  If  the  impulse  train  is  not  equally 
spaced,  however,  the  problem  of  determining  the  number  and  locations  of  all  of  the 
echoes  from  the  complex  cepstrum  becomes  quite  difficult.  It  is  therefore  interesting 
to  consider  some  examples  of  recovery  of  the  impulse  train. 

If  the  impulse  train  is  minimum-phase  (or  has  been  made  minimum-phase  by  expo¬ 
nential  weighting),  then  we  have  seen  that  the  impulses  attributable  to  the  echoes  occur 
only  in  the  region  nSn]f  where  n^  is  the  shortest  echo  delay.  Thus  a  long-pass  system 
that  multiplies  by  zero  for  all  n  <  nj  can  be  used  to  recover  an  approximation  to  the 
impulse  train.  The  calculations  can  be  further  simplified  if  the  impulse  train  is 
minimum-phase,  since  we  can  use  the  log  magnitude  alone  to  compute  the  even  part 
of  the  complex  cepstrum.  Then  the  long -pass  system  can  be  chosen  to  perform  the 
Hilbert  transform  operation  on  the  impulse  train,  as  well  as  remove  most  of  the 
minimum-phase  part  of  the  signal. 

We  have  contended  that  the  short-time  complex  cepstrum  can  be  thought  of  as  having 
a  component  caused  by  the  impulse  train.  If  this  is  so,  then  we  al^o  should  be  able 
to  carry  out  short-time  echo  detection.  To  show  that  this  is  true,  let  us  consider 
some  examples. 

Example  12 

The  speech  signal  was  convolved  with 
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p(n)  = 


M-l 

(3/4)k  6(n-k832). 
k=0 

The  input  was  exponentially  weighted  and  the  even  part  of  the  short-time  complex  cep- 
strum  was  computed  by  using  only  the  log  magnitude  of  the  FFT.  The  length  of  each  seg¬ 
ment  was*  L  =  4096  samples  and  N  =  8192.  In  Fig.  44a  we  show  the  cepstrum  for  the 
second  segment  (£  +  4096),  Each  trace  com  ’ponds  to  1024  samples.  Note  the  impulses 
at  832,  1664,  and  so  forth.  After  multiplying  by  2  for  n^800  and  by  0  for  n  <  800,  the  out¬ 
put  of  the  inverse  system  appears  as  in  Fig.  44b.  We  have  recovered  a  very  good 
approximation  to  p(a). 

Example  13 

The  speech  signal  was  convolved  with 
p(n)  -  6(n)  +  3/4  6(n-832)  +y  6(n-1536). 

As  in  the  previous  example,  L  =  4096  and  N  =  8192.  In  Fig.  44e  and  44f  we  show  the  cep¬ 
strum  and  output  for  the  second  segment  (|  =  4096).  Again,  it  is  clear  that  the  output  is 
a  very  good  approximation  to  p(n). 

Example  14 

The  speech  signal  was  convolved  with 
p(n)  =  6(n)  +  h(n-832), 

where 

h(n)=i(3/4)n  nJO 

=  0  n  <  0. 

In  Fig.  44c  we  show  the  cepstrum,  and  in  Fig,  44d  we  show  the  output  after  processing 
as  in  the  other  examples.  We  see  that  the  impulses  in  the  cepstrum  are  convolved  with 

(M 

the  sequences  '  'h(n),  as  discussed  in  section  5.  3.  The  output  also  shows  that  we  have 
recovered  a  good  approximation  to  p(n).  Note  how  the  impulses  are  dispersed  in  the 
cepstrum.  This,  of  course,  makes  it  even  more  difficult  to  detect  the  echoes  in  the 
cepstrum. 
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VI.  CONCLUSION 


6.1  SUMMARY 

We  have  presented  a  new  approach  to  separating  convolved  signals.  A  detailed  anal¬ 
ysis  of  homomorphic  systems  for  deconvolution  has  been  given,  and  we  have  shown  how 
such  signal  transformations  may  be  realized  by  using  a  digital  computer.  As  an  appli¬ 
cation  we  have  considered  the  class  of  signals  in  which  one  of  the  members  of  the  con¬ 
volution  is  an  impulse  train.  Although  our  examples  have  dealt  primarily  with  speech 
analysis  and  the  removal  of  echoes  from  speech  signals,  it  should  be  emphasized  that 
almost  all  of  our  results,  particularly  in  Section  II,  apply  to  more  general  situations. 
Therefore,  it  is  felt  that  the  point  of  view  that  is  reflected  in  this  work  is  important  and 
our  results  have  demonstrated  that  homomorphic  deconvolution  may  be  a  useful  approach 
in  many  interesting  problems. 

6.2  SUGGESTIONS  FOR  FUTURE  RESEARCH 

Although  some  interesting  results  have  been  obtained,  there  are  still  significant 
questions  that  would  be  worthy  of  further  investigation.  For  example,  it  is  quite  pos¬ 
sible  that  other  computational  realizations  can  be  obtained.  This  comment  is  prompted 
by  our  observation  in  section  2.  7  that  for  sequences  of  length  M,  a  total  of  M  values 
of  the  complex  cepstrum  suffice  to  completely  determine  the  original  sequence.  For 
nonminimum  phase  sequences,  a  direct  method  of  computation  of  the  necessary  values 
of  the  complex  cepstrum  which  would  avoid  aliasing  would  be  a  worthwhile  result. 

Another  issue  is  the  question  of  appropriate  window  functions  to  use  in  short-time 
analysis.  In  speech  analysis,  this  is  an  important  consideration.  It  is  also  possible 
that  other  weighting  sequences  besides  the  exponential  can  be  found  which  would  tend  to 
minimize  the  errors  that  we  have  discussed  with  respect  to  short-time  echo  removal. 

In  both  speech  analysis  and  echo  removal,  it  would  also  be  useful  to  have  more  general 
results  on  the  complex  cepstrum  of  an  impulse  train  v  ith  nonuniform  spacing. 

One  of  the  issues  that  we  have  only  touched  upon  is  the  effect  of  additive  noise. 
Limited  experimental  results  have  been  obtained,  but  adequate  understanding  of  this 
issue  is  a  challenging  problem. 

As  well  as  the  issues  relating  to  carrying  out  homomorphic  deconvolution,  it  is  of 
interest  to  consider  situations  in  which  our  techniques  might  be  successfully  applied. 

The  present  work  has  shown  that  there  are  clearly  advantages  to  homomorphic  deconvo¬ 
lution  when  one  of  the  signals  is  an  impulse  train.  Given  the  ease  with  which  one  can 
think  of  signals  of  this  class,  it  seems  clear  at  this  point  that  the  techniques  presented 
here  should  find  application  in  many  diverse  areas.  For  example,,  it  is  possible 
that  homomorphic  deconvolution  may  be  used  to  obtain  very  accurate  synchronous 
pitch  detection.  Also,  it  appears  that  there  are  possibilities  of  application  to  seis¬ 
mic  signals,  for  both  dereverberation  and  detection.  Still  another  area  may  be  in 
processing  underwater  acoustical  signals. 
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APPENDIX 
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Vector  Space  for  Convolution 

A.  1  DEFINITIONS 
A.  1.  1  Field 

A  field  consists  of  a  set  of  objects  called  scalars  (in  our  case  numbers),  together  with 

25 

two  operations  called  addition  and  multiplication  which  satisfy  the  following  conditions. 

1 .  To  every  pair  of  scalars  a  and  b,  there  corresponds  a  scalar  a  +  b  called  the 
sum  of  a  and  b  such  that 

(a)  a  +  b  =  b  +  a 

(b)  a  +  (b+c)  =  (a+b)  +  c 

(c)  there  is  a  unique  zero  scalar,  such  that  a  +  0  =  a 

(d)  to  every  scalar  a  there  corresponds  a  unique  scalar  -a  such  that  a  +  (-a)  =  0. 

2.  To  every  pair  of  scalars  a  and  b  there  corresponds  a  scalar  ab  called  the 
product  of  a  and  b  such  that 

(a)  ab  =  ba 

(b)  a(bc)  =  (ab)c 

(c)  there  exists  a  unique  scalar  1  called  one  such  that  al  =  a, 

(d)  to  every  nonzero  scalar  a,  there  corresponds  a  unique  scalar  a-1  such  that 
aa"1  =  1 

(e)  a(b+c)  =  ab  +  ac. 

Examples  of  a  field  are  the  sets  of  real  numbers,  of  rational  numbers,  and  of  com¬ 
plex  numbers,  where  addition  and  multiplication  have  their  usual  meaning. 

A.  1. 2  Vector  Space 

A  vector  space  consists  of  a  field  of  scalars,  together  with  a  collection  of  elements 
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called  vectors  having  the  following  properties. 

1.  To  every  pair  of  vectors  x  and  y  there  corresponds  a  vector  x  +  y  called  the 
sum  of  x  and  y  such  that 

(a)  addition  is  commutative,  x  +  y  =  y  +  x 

(b)  addition  is  associative,  w  +  (x+y)  =  (w+x)  +  y 

(c)  there  is  a  unique  zero  vector  such  that  x  +  0  =  x 

(d)  there  is  a  unique  inverse  vector  x  +  (-x)  =  0. 

2.  To  every  scalar  a  and  vector  x  there  corresponds  a  vector  ax  such  that 

(a)  a(bx)  =  (ab)x 

(b)  lx  =  x  for  every  vector  x 

(c)  a(x+y)  =  ax  +  ay 

(d)  (a+b)x  =  ax  +  bx. 

We  shall  verify  that  convolution  is  an  appropriate  operation  for  vector  addition,  and 
attempt  to  clarify  the  meaning  of  scalar  multiplication  for  convolutional  vector  spaces. 


A.  2  CONVOLUTION  AS  VECTOR  ADDITION 


Our  set  of  vectors  is  the  set  of  all  sequences  whose  z  transforms  exist  and  have 
overlapping  regions  of  convergence.  The  operation  of  convolution  is 


vu 

®y=  £  x(k)y<n-k). 


(A.  1) 


k=-» 


By  a  simple  change  of  summation  index,  we  see  that  convolution  is  commutative,  that  is. 


x  ®  y  =  ^  y(k)  x(n-k)  =  y  ®  x. 


k=-oo 


Similarly, 


00  00 


(w  ®  x)  ®  y  =  ^  w(k)  x(m-k)  y(n-m)  =  J  w(k)  ^  x(m-k)  y(n- 


m=-oo  k=-«o 


k=-»  m=-oo 


=  y  w(k)  y  x(m)  y(n-k-m)  =  w  ®  (x  ®  y), 
k=-oo  m=~oo 

so  that  convolution  is  also  associative.  The  zero  vector  is  clearly  the  sequence  6  such 
that 

6(n)  =  0  n  *  0 

=1  n  =  0. 

The  inverse  sequence  for  a  sequence  x  is  simply  the  inverse  z  transform  of  l/X(z), 
where  X(z)  is  the  z  transform  of  x. 

A.  3  SCALAR  MULTIPLICATION 

We  shall  denote  scalar  multiplication  by  ^x.  To  begin  to  see  what  scalar  multipli¬ 
cation  means  ior  convolutional  vector  spaces,  let  us  assume  that  b  is  an  integer.  Con¬ 
ventionally  we  say  that  multiplication  of  a  vector  x  by  an  integer  is  equivalent  to  adding 
the  vector  to  itself  b  times.  This  is  a  direct  consequence  of  the  postulates  because,  for 
example, 

2x  =  (l  +  l)x  =  x  +  x. 

Thus  in  the  case  of  convolution  we  say  that  ’x  corresponds  to  the  convolution  of  x 
with  itself  b  times.  That  is, 

<b>x  -  <b-^x  ®  X. 


where 


x 


6 


and 

M)x  ®x  =  6. 

The  set  of  all  positive  and  negative  integers  does  not  constitute  a  field;  however,  the 
set  of  all  rational  numbers  does.  In  this  case,  it  is  only  slightly  more  difficult  to  inter¬ 
pret  scalar  multiplication.  For  example,  suppose 

A  reasonable  interpretation  in  this  case  is  given  by  the  expression 
x  =  (2>y  =  y  ®  y. 

In  general  for  rational  scalars  we  can  give  the  interpretation 


where  a  and  b  are  integers. 

An  alternative  interpretation  of  the  scalar  multiplication  results  from  consideration 
of  the  z  transform.  For  example,  the  z  transform  of  ^x,  where  b  is  an  integer,  is 
[X(z)]b.  That  is,  X(z)  raised  to  the  btb  power.  In  the  second  case,  we  note  that  for 


we  obtain 

[Y(z)]b  =  [X(z)]a. 

Alternatively,  the  function  [Xfz}^^  is  normally  defined  by 
[X(z)]a/b  =  exp j-j~- log  [X(z)]j , 

where  (X{z)]a//b  is  clearly  a  multivalued  function  of  z.  In  fact,  we  may  define 

[X(z)]b  =  exp{b  log[X{z)]}  (A.  2) 

for  b  a  real  or  complex  number. 

It  is  evident  from  Eq.  A.  2  that  the  definition  of  scalar  multiplication  is  intimately 
related  to  the  proper  definition  of  log  [X(z)].  In  order  that  [X(z)]a  have  a  Laurent 
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expansion  and  thus  be  thought  of  as  a  z  transform,  we  require  that  [X(z)]a  be  singie- 

22 

valued.  This  can  be  accomplished  through  the  concept  of  the  Riemann  surface.  Under 
the  assumption  that  [X(z)]a  is  uniquely  defined,  the  verification  of  the  conditions 
regarding  scalar  multiplication  involves  only  straightforward  manipulations  of  powers 
of  the  z  transform. 
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A  new  approach  to  separating  convolved  signals,  referred  to  as  homomorphic  decon¬ 
volution,  is  presented.  The  class  of  systems  considered  in  this  report  is  a  member  of 
a  larger  class  called  homomorphic  systems,  which  are  characterized  by  a  generalized 
pi-inciple  of  superposition  that  is  analogous  to  the  principle  of  superposition  for  linear 
systems. 

A  detailed  analysis  based  on  the  z-transform  is  given  for  discrete-time  systems 
of  this  class.  The  realization  of  such  systems  using  a  digital  computer  is  also  dis¬ 
cussed  in  detail.  Such  computational  realizations  are  made  possible  through  the  appli¬ 
cation  of  high-speed  Fourier  analysis  techniques. 

As  a  particular  example,  the  method  is  applied  to  the  separation  of  the  compo¬ 
nents  of  a  convolution  m  which  one  of  the  components  is  an  impulse  train.  This  class 
of  signals  is  representative  of  many  interesting  signal-analysis  and  signal-processing 
broblems  such  as  speech  analysis  and  echo  removal  and  detection.  It  is  shown  that 
homomorphic  deconvolution  is  a  useful  approach  to  either  removal  or  detection  of 
echoes. 
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